Posts Tagged ‘UniBasic’

Statement Code Coverage Testing – Part 2

November 26, 2011 1 comment

Back in November 2009 I posted the “UniBasic Code Coverage” project as an open-source project. Back then it was stripped version based on one I set up for my then employer. The version for my employer used an in-house pre-processor that greatly simplified the work I needed to do for it work with our source files.

I have now released the v0.2 (update: v0.8) development version which has fixed several bugs, added the ability to specific a customer pre-process for those don’t use string UniBasic and provided improved the documentation on installing, using and contributing.

As you will already be aware, the source code for this is hosting on the UniBasic Code Coverage Project at SourceForge in a Subversion repository. If you have Subversion installed, you can checkout the code with the following command:

svn co ucov

If you are running UniData or UniVerse on Windows, I highly recommend you install Tortoise SVN as it greatly simplifies working with Subversion.

On the SourceForge site you will not only find the Subversion repository for all the code, but also ‘Tracker’ which will allow you to submit Feature and Bug tickets. If you need help with anything, you can submit a Support Request as well.

If you wish to contribute to the code or documentation, you can introduce yourself on the Developer Forum. The best way to submit code or doc is by generating a Diff of the changes, as well as what the behaviour was before the change and what it was after the change.

When you have used UBC, be sure to fill out a Review. All constructive input is welcome and appreciated!

Making Life Easier

Every developer worth their salt has little snippets of code that they use to make their life easier.

So today, I thought I’d share a little utility that I use all the time.

Readers, meet RL. RL, meet readers. RL (Run Line) allows you to run a single line of code to see what it does. Typically, I use this as a quick way to check out an OCONV express I haven’t used in a while or as a calc replacement if I don’t feel like tabbing out of the terminal. All you need to do is compile and catalog to enjoy its simpleness. RL supports 1 option, ‘-H’. This option allows you hide the compiler output if you wish.

To use RL, you can either enter the code as part of the command line arguments, or you can enter it as an input. Here is a screenshot of RL in action:

RL Utility

RL Utility


Disclaimer: I strongly recommend against implementing this on a production machine as it will allow arbitrary code execution. This code has only been tested on UniData 7.2.x. Feel free to use this code however you want. If you somehow turn this in to something profitable share the love and either buy me a beer or make a donation to an open source project.

Okay, so enough with the spiel. Here’s the code :
(Updated 2011-06-02 due to ‘<-1>’ being stripped)


EQU TempDir TO "_PH_"

OPEN TempDir TO FH.TempDir ELSE STOP "ERROR: Unable to open {":TempDir:"}"

* Determine if we should hide compiler output
* Also determine the start of any command line 'code'

   HideFlag = @TRUE
   CodeStart = COL2() + 1
   CodeStart = COL1()
   IF CodeStart = 0 THEN
      CodeStart = LEN(@SENTENCE) + 1 ;* Force it to the end
      CodeStart += 1 ;* Skip the Space

   HideFlag = @FALSE

* Get the code from the command line arguments, or
* Get the code from stdin

   Code = @SENTENCE[CodeStart, LEN(@SENTENCE) - CodeStart + 1]
   Code = TRIM(Code, " ", "B")
   PROMPT ''
   CRT "Enter Code: ":
   INPUT Code

* Compile, catalog and run the program
* We only catalog it so that @SENTENCE behaves as you would expect

WRITEU Code TO FH.TempDir, TempProg ON ERROR STOP "ERROR: Unable to write {":TempProg:"}"

Statement = "BASIC ":TempDir:" ":TempProg
Statement<-1> = "CATALOG ":TempDir:" ":TempProg:" FORCE"

IF HideFlag THEN
   EXECUTE Statement


* Clean up time

DecatStatement = "DELETE.CATALOG ":TempProg

DELETE FH.TempDir, "_"TempProg
DELETE FH.TempDir, TempProg


The problem with numbers

October 11, 2010 4 comments

UniData, like other weakly-typed systems, makes some programming tasks easier by not needing the developer to declare and adhere to a data type with variables. The general pros and cons of this have been debated many times across many languages and hence will not discussed here. What will be discussed is specific cases where this can cause you unexpected headaches.

A posting was made on the u2ug forums by Koglan Naicker about some unexpected issues when finding duplicates in a data-set.

In short, he found that when strings contained large numbers, it would sometimes incorrectly evaluate two different strings as equal. For example:

IF '360091600172131297' EQ '360091600172131299' THEN CRT "Equal"

The above code results in “Equal” being displayed on the screen. This is caused by a combination of 2 factors.

The first being that UniData is weakly typed. This means that it does not explicitly distinguish between strings and numbers, but attempts to determine the data type by examining the data. In this case, since the strings are numeric, it automatically treats them as numbers.

The second part of this issue is because now that it is treating those 2 strings as numbers, it needs to handle them in an appropriate data type on the CPU. Since the 2 strings are too large to be treated as an integer, they get converted to a floating-point number. Due to rounding that occurs, this actually results in both of these strings being converted to the same float-point representation! A better method may have been to use something such as Bignum instead of converted to floating-point. There would be a speed trade-off, but surely that would have been better than potentially incorrect programs.

Some people suggest prefixing or appending a non-number character to each string to force them to be treated as a string. Not entirely elegant and can have performance implications. Fortunately, UniData does have proper functions to handle these situations. In the case where you will be comparing strings that may consist of only numeric characters, you should use the SCMP function. This function compares two strings as strings, regardless of the actual data in them. Using this when you need to be certain how the comparison is performed can save you a lot of headaches in the future.

Also of interest is that this issue doesn’t just apply to UniBasic, but can also affect UniQuery!

It should be noted though, this only affects UniQuery when the dictionary item is right-aligned with the format field (eg, 20R in attribute 5).

You can tested this by creating a file and creating 3 records with the @ID of ‘360091600172130474’, ‘360091600172131297’ and ‘360091600172131299’.

Now, select upon the file where the @ID = ‘360091600172131297″ and you can see that 2 records are returned!

Results of selection

Non Unique Select

When explicitly selected a record via a unique key, this isn’t the result a database should return.

So, when dealing with large, potentially numeric fields with UniQuery, you may need 2 dictionary items. A left-aligned one for selecting on and a right-aligned one if you require numerical sorting.

Crouching Null, Hidden Bug

May 2, 2010 1 comment

Null, (actually just an empty string “” in U2) is a valid value. Normally, it would be treated exactly the same as other normal values, such as 1 or “1”, but it isn’t always.

I’ve seen a few bugs that have been created by not understand the differences in how nulls are treated. When debugging the code can look completely valid as well, meaning it takes even longer to identify and rectify the issue.

Okay, lets see how you go. I’ll give you a few series of records, each created with the same data. Your job, is to work out which records in the series will exactly match their ‘CONTROL.REC’. Good luck and try to do it without needing to compile the code!

Series 1:

 CONTROL.REC = "" : @AM : "A"


 REC(0) = ""
 REC(0)<2> = "A"

 REC(1) = ""
 REC(1)<-1> = "A"

 REC(2) = ""

 REC(3) = "A"
 INS "" BEFORE REC(3)<1>

Series 2:

 CONTROL.REC = "A" : @AM : ""


 REC(0) = "A"
 REC(0)<2> = ""

 REC(1) = "A"
 REC(1)<-1> = ""

 REC(2) = "A"
 INS "" BEFORE REC(2)<2>

 REC(3) = ""

Series 3:

 CONTROL.REC = "" : @AM : "A" : @AM : ""


 REC(0) = ""
 REC(0)<2> = "A"
 REC(0)<3> = ""

 REC(1) = ""
 REC(1)<-1> = "A"
 REC(1)<-1> = ""

 REC(2) = ""
 INS "" BEFORE REC(2)<1>

 REC(3) = ""
 INS "" BEFORE REC(3)<3>

Series 4:

 CONTROL.REC = "A" : @AM : "" : @AM : "A"


 REC(0) = "A"
 REC(0)<2> = ""
 REC(0)<3> = "A"

 REC(1) = "A"
 REC(1)<-1> = ""
 REC(1)<-1> = "A"

 REC(2) = "A"
 INS "" BEFORE REC(2)<1>

 REC(3) = "A"
 INS "" BEFORE REC(3)<2>

Series 5:

 CONTROL.REC = "A" : @AM : "" : @AM : ""


 REC(0) = "A"
 REC(0)<2> = ""
 REC(0)<3> = ""

 REC(1) = "A"
 REC(1)<-1> = ""
 REC(1)<-1> = ""

 REC(2) = "A"
 INS "" BEFORE REC(2)<2>
 INS "" BEFORE REC(2)<2>

 REC(3) = "A"
 INS "" BEFORE REC(3)<2>
 INS "" BEFORE REC(3)<3>
 REC(4) = ""
 INS "" BEFORE REC(4)<1>

Series 6:

 CONTROL.REC = "A" : @AM : "B" : @AM : "C"


 REC(0) = "A"
 REC(0)<2> = "B"
 REC(0)<3> = "C"

 REC(1) = "A"
 REC(1)<-1> = "B"
 REC(1)<-1> = "C"

 REC(2) = "A"

 REC(3) = "A"
 REC(4) = "C"

Did you get them all? I’ve put the answers as a comment so you can check them if you want. I’d be surprised if you got them all right…

So, what should you take away from this? <-1> and INS can give you (or others!) a world of headaches if you don’t understand their peculiarities with null values.

Final Note: In UniVerse INS behaviour for some cases is dependant on the flavour your account is running in and the $OPTIONS EXTRA.DELIM setting.

Improving U2 Security

April 11, 2010 3 comments

The general IT knowledge of security has come along way in the last 20 years. Even more dramatically when considering the last 10 years.

People are generally aware that unless due care is taken, their computer could be injected with a virus, have personal information stolen from it or even be used to facilitate crime. Major OS Vendors have picked up their game and now are putting in a better attempt to prevent compromises from the OS level. Sure, you still hear the odd story about the latest privilege escalation, but compared to what it use to be…

Network level security has been given most of the attention (and IT budget funding) and is *generally* fairly secure these days. Application level is where most of the major hacks are happening now, but unfortunately, corporate uptake on securing their systems at the Application level hasn’t been as good as it was with the Networks.

Let’s be honesty and not undersell ourselves, securing complex applications is no mean feat. It takes knowledge, planning, lots of time & patience and sometimes out-of-the-box thinking. Thankfully, most modern programming languages and Database Management Systems do the heavy lifting for us. From the security features built into C# and Java to the vastly improved safety net found in SQL engines with fine-grained access control and in-built functions for preventing SQL injection, a lot of the basics have been solved.

This is where the U2 family has a few gaps to be filled. UniBasic needs some inbuilt functions for sanitisation, UniObjects needs some form of access control built around it and UniQuery/RetrieVe prepared statements/stored procedures would be nice.

With the increase push in integrating U2 servers as databases for modern front-ends such as web applications, data sanitisation is going to become a prevalent topic in the community. Built-in functions for UniQuery/RetrieVe, SQL and HTML sanitisation/encoding would be welcome additions to the UniBasic command family. Even better would be some form of prepared statements for the query languages. This make it simpler and easier to obtain better program security.

UniObjects is touted as a standard method of connecting GUI application front-ends to a U2 back-end. However, due to the limited access control supported by UniObjects, it is a dangerous hole in your system to have the required port open for anything other than back-end servers. Take into considering user ‘X’. User ‘X’ has appropriate login credentials for the old green screen system. IT brings out a new Windows GUI application, lets say for reporting, that runs on the user’s machine and uses UniObjects to connect to U2. In the old green screen system, User ‘X’ was limited to set menus and programs to run and could not get access to ECL/TCL. With enough knowledge (and malice), User ‘X’ can now freely use his green screen login credentials to log into the U2 system via UniObjects read/write records directly and even execute raw ECL/TCL commands.

So what exactly is the problem with UniObjects? Quite simply put, it has no fine-grained server-side control of what actions can be done, or commands issued via UniObjects. As long as you can log in, you can get a free pass to the back-end’s data. Let’s take MsSQL as a counter example. You can create views, stored procedures, grant or deny users a suite of privileges to tables and commands. Essentially, UniData needs to be able to have some access control scheme for UniQuery that allow you to define whether the users and read/write records in certain files. Ideally, all read/writes would be done through U2 UniBasic subroutines, with RPC daemon having the ability to have a command ‘white-list’ setup. That way, all data access can be moderated with UniBasic code and the RPC daemon having a white-list that only allows access to calling those subroutines.

All this highlights an issue we need to overcome as a community. The lack of U2 specific security literature. Where is the UniData/UniVerse security manual? Where is the “Top 10 common security mistakes” for U2? Sadly, security does seem to be an afterthought. Sometimes even a ‘neverthought’.

Security is not Obscurity. Even in U2 [Part 3]

March 1, 2010 1 comment

A few years ago I read an interesting article titled Denial of Service via Algorithmic Complexity Attacks. When I started working with UniData, It never crossed my mind that U2 had the same class of vulnerabilities, but it does.

If you develop for a U2 system where you cannot afford for malicious internal/external entities to adversely affect system performance, then I highly suggest you read the above linked paper.

I’ll divide this into 3 sections.

Hash file vulnerability
Dynamic Array vulnerability

The first place I’ll draw your attention to is the humble hash file at the core of UniData and UniVerse. As you probably know, each record is placed in a group dependant on the hash value of its record ID, along with the modulo and hashing algorithm of the file. Now, there are 2 hashing algorithms that a hashed file can use. Type 0 or ‘GENERAL’ is the default, general use hashing algorithm, whereas Type 1 or ‘SEQ.NUM’ is an alternative you can specify and is designed to handle sequential keys. The hash file is basically a hash table with chaining.

Let’s assume we’re working at the HackMe Ltd company that has made a public website to integrate with their existing backend system, which is UniData driven. It is decided that people can pick their own usernames when signing up. Since these usernames are unique, they have been used as the record ID.

Ever since he was not hired after interviewing at HackMe Ltd, Harry has wanted to show them up. Knowing that they used UniData on the backend from his Interview (and their job ads), he installed UniData and makes some initial guesses at the modulo for their ‘users’ tables and calculates a few usernames sets for different modulus.

Now, by going to their website and taking timings for the “Check username availability” feature, Harry was able to become reasonably sure of the modulo for the file. Setting up his computer to run all night generating keys that hashed to a single group. Setting up his email server to automatically do a wget on the confirmation URL on received emails (hence getting around the “Confirm email address” emails).

The next day he runs a script to sign-up all the usernames gradually over the day. After they have all been signed up, Harry now simply scripts a few “Check username availability” calls for his last username generated to start his Denial of Service attack. Essentially, he has taken the non-matching lookup performance of the hash file from O(1 + k/n) to O(k) (where k is the number of keys and n is the modulo). Even worse than that, because of how level 1 overflows work, it now requires multiple disk reads as well (UniData only I believe). Continual random access to that file that is heavily weighted in one group is O(k^2)

Now, to give you a visual example, I have run a test on my home machine and produced 2 graphs.

Test specs:

CPU: Core Duo T7250 (2.0GHZ)
OS: Vista SP2 (32-bit)
DB: UniData 7.2 PE (Built 3771)
Hash File: Modulo 4013 – Type 0

The test:
Pre-generate 2 sets of numbers. One is of sequential keys, the other is of keys chosen because they all hash to a single group. Timings are recorded for the total time in milliseconds for:

  1. Write null records for all the keys and
  2. read in all the records.

Separate timings for sequential and chosen keys are taken. The test is repeated for different key counts from 1000 to 59000 in 1000 increments.

DOSAC UniBasic Code

First Graph – Sequential key timings by themselves:

Sequential Timings Only

Second Graph – Chosen key alongside sequential key timings:

Sequential and Chosen Timings

Naturally, timings are rough, but they are accurate enough to paint the picture.

Actually, now that I’ve mentioned painting…

Have you heard of Schlemiel the Painter?

Schlemiel gets a job as a street painter, painting the dotted lines down the middle of the road. On the first day he takes a can of paint out to the road and finishes 300 yards of the road. “That’s pretty good!” says his boss, “you’re a fast worker!” and pays him a kopeck.

The next day Schlemiel only gets 150 yards done. “Well, that’s not nearly as good as yesterday, but you’re still a fast worker. 150 yards is respectable,” and pays him a kopeck.

The next day Schlemiel paints 30 yards of the road. “Only 30!” shouts his boss. “That’s unacceptable! On the first day you did ten times that much work! What’s going on?”

“I can’t help it,” says Schlemiel. “Every day I get farther and farther away from the paint can!”

(Credit: Joel Spolsky, 2001)

When looking at Dynamic arrays in U2, you should see how they can be exactly like a computerised version of Schlemiel the Painter. In fact, a public article on PickWiki pointed this out quite some time ago. UniData is affect more so than UniVerse, in that UniVerse has an internal hint mechanism for attributes. The problem with this is, if an uncontrolled (eg, external) entity has control over the number of items in the dynamic array, you could be vulnerable to a Denial of Service attack. It could even be unintentional.

So, let’s see what all fuss is about. Firstly, a quick recap on the issue with dynamic arrays.

Essentially when doing an operation like “CRT STRING” it has to scan the string character by character counting attribute, multi-value and sub-value marks as it does. If you increment Y or Z (or X in UniData’s case) and do the same operation, it has to re-scan the string from the start all over again. As the number of elements increases, the more noticeable the flaw in this method becomes. In fact, cycling through each element in this manner is an O(k^2) algorithm.

I’ve seen this issue bite and bite hard. It involved 2 slightly broken programs finding just the right (or wrong) timing.

The first program was a record lock monitoring program. It used GETREADU() UniBasic command, after which it looped over every entry and generated a report on all locks over 10 minutes old. This process was automatically scheduled to run at regular intervals. It had been operating for months without issues.

The second program was a once off update program. Basically, it read each record in a large file, locked it then if certain complex conditions were met, it updated an attribute and moved on to the next record. See the problem? It didn’t release a record if it didn’t need updating. The processing was estimated to take about 30 minutes and as it turns out, not many records met the complex conditions.

See the bigger problem now? Yup, that’s right, the dynamic array returned by GETREADU() was astronomical! This resulting in the monitoring program saturating a CPU core. The same core the update program was running on. Uh oh! System performance issues ensured until the culprit was found and dealt with.

So, what do we do about these issues? You want a stable system right? One that is less easy to bring to its knees by malicious users and unfortunate timings of buggy code?

Hashed files:

DO NOT use external input as record keys! Place it in attribute 1, build a D-type dictionary and index it if you need, but not use it as the @ID!

A further option would be to have Hash files and their hashing algorithms updated to be able to deal with this type of malicious edge case. Other languages have (take Perl for example) updated their hash tables now to use hashing algorithms to be seeded at run-time. These means you cannot prepare ‘attack’ keys ahead of time and cannot replicate how the hashing works on another computer, since the hash algorithm will be seeded differently. Obviously, this cannot be done exactly the same with Hash files, is they are a persistent data store. It could however be done on each CREATE.FILE. That way, even if a malicious party can determine the modulo of a file, they be able to duplicate it on their system as each file will be seeded differently. Doing this would bring UniData and UniVerse inline with the security improvements made in other modern stacks.

Dynamic arrays:

This one is simple. Use REMOVE, don’t use simple FOR loops. Think through your data and were it is being sourced from. Is it from external entities? Is it from internal entities whose behaviour cannot be guaranteed to remain within safe bounds? If the answer to either of those questions is even a ‘Maybe’, stay safe and use REMOVE.

Security is not Obscurity. Even in U2 [Part 1]

December 6, 2009 2 comments

Sure, we may benefit from whatever shelter is derived from running a less widely used/understood system, but relying solely on security through obscurity with your U2 system is, to put it nicely, extremely naive in this day and age. It just doesn’t add up when you pay thousands of dollars for firewalls and other network security paraphernalia and wouldn’t dream of allowing raw user input through in your SQL-based applications.

U2 may have different syntactical spices and a different method of representing data than its mainstream counterparts but the core principles behind secure coding practices still apply:

The list goes on.

So, what specific vulnerabilities should we look out for?

Let us start with the humble EXECUTE/PERFORM statements in UniData and UniVerse. SQL Injection is a widely know subject, but how many U2 developers have considered UniQuery/RetrieVe Injection? Did you know that in some cases, malformed UniQuery in a EXECUTE can drop you to ECL?

As developers in the U2 world, the same lessons learnt in SQL Injection can and should be applied when using *EXECUTE/*PERFORM, etc. Sanitise your input!

Do you have any statements that work like this?


In this case, TAINTED.INPUT is either supplied by a user or is comes from an external source. The results from the SELECT statement are now compromised and can contain any data. Take for instance the following input for this contrived example:

" OR WITH CC.NUMBER = "4657000000000000

Essentially, this converts the innocent SELECT statement which, for example, was used to search for customer’s first names to get contact numbers, into one which can be used to find Credit Card numbers (hopefully though, your CC numbers are encrypted in some manner). Even worse, if your program displays error messages that reveal record names when they cannot be read, then an attacker with patience can reveal almost any data they want from your system.

Remember, in UniData you can use the ‘USING’ keyword to specify any FILE to source the dictionaries (UniVerse does not have this, I believe). Aside the all the usual manipulation of results, the USING means that if someone can control the first few lines of a record (temp data dumping file anyone?) then using the SUBR() call, they can even cause programs and subroutines to be called!

Before you EVER use input from a user or an external source, make sure it is validated and sanitised. Expecting a number? Use MATCH and 1N0N. Expecting a name? Make sure it is doesn’t contain double quotes. Don’t want to allow ‘searching’ with your SELECT? For example, I use the following check to ensure the user input doesn’t escape the SELECT string with double quotes or attempt a wildcard search with [ or ].


Further to this UniQuery/RetrieVe Injection vulnerability, earlier I mentioned that in certain situations you could cause it to crash to ECL.

Are you running UniData in PICK mode? If you are, I suggest you type ‘UDT.OPTIONS’  at ECL right now and continue down until you see whether ’41  U_UDT_SERVER’ is set to ON or OFF. Now, did it say OFF? If so, then read on because you may be vulnerable.

While that option is turned off, certain malformed UniQuery statements can cause you to crash straight to ECL, even if you are in a program called by a program.

Lets see an example. First, compile the following program in Pick Mode.

CRT "Enter the program to select"
CRT "Executing query..."
CRT "Checking results..."
   CRT "Program FOUND!"
   CRT "Program doesn't exist"

Now, when you run the program (I called it CRASHTEST), put a record ID that exists in BP. Try again with a record ID that doesn’t exist. Your results should be something like this:

Normal Program Operation

Normal Program Operation

Looks good. Program works as expected. Underneath this simple program though, lies a bug feature in the Pick Parser. To show this, I will use a specifically formed input that will make the programs SELECT malformed in a manner to gain ECL access. This time when you run the program (with UDT.OPTIONS 41 off) type in this input, including quotes:
" @ID="
In this case, the program will crash out before ever returning.

Crashed from EXECUTE

Crashed from EXECUTE

There are 2 ways to deal with this. The first way is to set UDT.OPTIONS 41 to ON. This will result in the EXECUTE returning so we can handle it whatever way we wish.

The other way is to set ON.ABORT. I created a VOC paragraph for this, called EXCEPTION as follows:

DISPLAY This program aborted

Running the same test above now results in:

Aborted program

Aborted program

Personally, I believe the method that returns control to the program (UDT.OPTIONS 41) and handles it accordingly (always check what was set by RETURNING) is the safest since it doesn’t give away that the program has a compromised EXECUTE statement. However, this may not always be a viable option, so make sure you at least have ON.ABORT set.

%d bloggers like this: