Archive

Posts Tagged ‘Bugs’

The problem with numbers

October 11, 2010 4 comments

UniData, like other weakly-typed systems, makes some programming tasks easier by not needing the developer to declare and adhere to a data type with variables. The general pros and cons of this have been debated many times across many languages and hence will not discussed here. What will be discussed is specific cases where this can cause you unexpected headaches.

A posting was made on the u2ug forums by Koglan Naicker about some unexpected issues when finding duplicates in a data-set.

In short, he found that when strings contained large numbers, it would sometimes incorrectly evaluate two different strings as equal. For example:

IF '360091600172131297' EQ '360091600172131299' THEN CRT "Equal"

The above code results in “Equal” being displayed on the screen. This is caused by a combination of 2 factors.

The first being that UniData is weakly typed. This means that it does not explicitly distinguish between strings and numbers, but attempts to determine the data type by examining the data. In this case, since the strings are numeric, it automatically treats them as numbers.

The second part of this issue is because now that it is treating those 2 strings as numbers, it needs to handle them in an appropriate data type on the CPU. Since the 2 strings are too large to be treated as an integer, they get converted to a floating-point number. Due to rounding that occurs, this actually results in both of these strings being converted to the same float-point representation! A better method may have been to use something such as Bignum instead of converted to floating-point. There would be a speed trade-off, but surely that would have been better than potentially incorrect programs.

Some people suggest prefixing or appending a non-number character to each string to force them to be treated as a string. Not entirely elegant and can have performance implications. Fortunately, UniData does have proper functions to handle these situations. In the case where you will be comparing strings that may consist of only numeric characters, you should use the SCMP function. This function compares two strings as strings, regardless of the actual data in them. Using this when you need to be certain how the comparison is performed can save you a lot of headaches in the future.

Also of interest is that this issue doesn’t just apply to UniBasic, but can also affect UniQuery!

It should be noted though, this only affects UniQuery when the dictionary item is right-aligned with the format field (eg, 20R in attribute 5).

You can tested this by creating a file and creating 3 records with the @ID of ‘360091600172130474’, ‘360091600172131297’ and ‘360091600172131299’.

Now, select upon the file where the @ID = ‘360091600172131297″ and you can see that 2 records are returned!

Results of selection

Non Unique Select

When explicitly selected a record via a unique key, this isn’t the result a database should return.

So, when dealing with large, potentially numeric fields with UniQuery, you may need 2 dictionary items. A left-aligned one for selecting on and a right-aligned one if you require numerical sorting.

Crouching Null, Hidden Bug

May 2, 2010 1 comment

Null, (actually just an empty string “” in U2) is a valid value. Normally, it would be treated exactly the same as other normal values, such as 1 or “1”, but it isn’t always.

I’ve seen a few bugs that have been created by not understand the differences in how nulls are treated. When debugging the code can look completely valid as well, meaning it takes even longer to identify and rectify the issue.

Okay, lets see how you go. I’ll give you a few series of records, each created with the same data. Your job, is to work out which records in the series will exactly match their ‘CONTROL.REC’. Good luck and try to do it without needing to compile the code!

Series 1:

 CONTROL.REC = "" : @AM : "A"

 DIM REC(3)

 REC(0) = ""
 REC(0)<2> = "A"

 REC(1) = ""
 REC(1)<-1> = "A"

 REC(2) = ""
 INS "A" BEFORE REC(2)<2>

 REC(3) = "A"
 INS "" BEFORE REC(3)<1>


Series 2:

 CONTROL.REC = "A" : @AM : ""

 DIM REC(3)

 REC(0) = "A"
 REC(0)<2> = ""

 REC(1) = "A"
 REC(1)<-1> = ""

 REC(2) = "A"
 INS "" BEFORE REC(2)<2>

 REC(3) = ""
 INS "A" BEFORE REC(3)<1>


Series 3:

 CONTROL.REC = "" : @AM : "A" : @AM : ""

 DIM REC(3)

 REC(0) = ""
 REC(0)<2> = "A"
 REC(0)<3> = ""

 REC(1) = ""
 REC(1)<-1> = "A"
 REC(1)<-1> = ""

 REC(2) = ""
 INS "A" BEFORE REC(2)<1>
 INS "" BEFORE REC(2)<1>

 REC(3) = ""
 INS "A" BEFORE REC(3)<2>
 INS "" BEFORE REC(3)<3>


Series 4:

 CONTROL.REC = "A" : @AM : "" : @AM : "A"

 DIM REC(3)

 REC(0) = "A"
 REC(0)<2> = ""
 REC(0)<3> = "A"

 REC(1) = "A"
 REC(1)<-1> = ""
 REC(1)<-1> = "A"

 REC(2) = "A"
 INS "" BEFORE REC(2)<1>
 INS "A" BEFORE REC(2)<1>

 REC(3) = "A"
 INS "" BEFORE REC(3)<2>
 INS "A" BEFORE REC(3)<3>


Series 5:

 CONTROL.REC = "A" : @AM : "" : @AM : ""

 DIM REC(4)

 REC(0) = "A"
 REC(0)<2> = ""
 REC(0)<3> = ""

 REC(1) = "A"
 REC(1)<-1> = ""
 REC(1)<-1> = ""

 REC(2) = "A"
 INS "" BEFORE REC(2)<2>
 INS "" BEFORE REC(2)<2>

 REC(3) = "A"
 INS "" BEFORE REC(3)<2>
 INS "" BEFORE REC(3)<3>
 
 REC(4) = ""
 INS "" BEFORE REC(4)<1>
 INS "A" BEFORE REC(4)<1>


Series 6:

 CONTROL.REC = "A" : @AM : "B" : @AM : "C"

 DIM REC(4)

 REC(0) = "A"
 REC(0)<2> = "B"
 REC(0)<3> = "C"

 REC(1) = "A"
 REC(1)<-1> = "B"
 REC(1)<-1> = "C"

 REC(2) = "A"
 INS "C" BEFORE REC(2)<2>
 INS "B" BEFORE REC(2)<2>

 REC(3) = "A"
 INS "B" BEFORE REC(3)<2>
 INS "C" BEFORE REC(3)<3>
 
 REC(4) = "C"
 INS "B" BEFORE REC(4)<1>
 INS "A" BEFORE REC(4)<1>


Did you get them all? I’ve put the answers as a comment so you can check them if you want. I’d be surprised if you got them all right…

So, what should you take away from this? <-1> and INS can give you (or others!) a world of headaches if you don’t understand their peculiarities with null values.

Final Note: In UniVerse INS behaviour for some cases is dependant on the flavour your account is running in and the $OPTIONS EXTRA.DELIM setting.

%d bloggers like this: