Have you ever heard of The Joel Test? If not, go read it now.
Have you read it yet? No? Go on, read it!
Okay, welcome back!
What I’m curious about here is how well the U2 (or even the wider MV) community fares along these lines. Support for source control is a bit behind and, from talking with people in some other U2 shops, the uptake of real modern tools isn’t too crash hot. I’d be really surprised if any of us scored greater than 9, but equally surprised if anyone with more than 4 developers scored below 4.
So let’s hear it. Give your honest answer (don’t sugar coat it!).
Although your answer will be anonymous (feel free to elaborate on your scores in a comment, however!), I’ll kick it off by giving the results as I see them for where I am now.
We scored 5 out of 12. Not completely horrible, but not ideal either. I’ll post the blow-by-blow breakdown in the comment section for those that are interested.
In the last post I suggested that each piece of information in a file record needed an associated dictionary item.
Some may look at their files and realise it just cannot be done. In that case, “you’re doing it wrong”.
Common case: You have a file that logs transactions of some sort. For each transaction, the program just appends the whole transaction to the record, creating a new attribute.
There are several issues with this style of record structure.
Firstly, you cannot create dictionary items to reference any of the information (unless, of course, you create a subroutine and call it from the dictionary). For example, if each transaction has a time-stamp, you cannot use UniQuery/RetrieVe to select all records with a certain time-stamp.
Secondly, any time a program reads in the record and needs to count how many transactions it holds, it has to parse the entire record. Now, if you have each bit of information stored in its own attribute (say, time-stamp in one attribute, amount in another, etc.), it would only need to parse the first attribute, potentially cutting down on the CPU expense greatly.
So, if you must store some sort of transaction/log style data in a U2 record, please reconsider the traditional approach of appending the whole transaction to the end and take a more U2 perspective by splitting each bit of information into its own attribute. This way, it will be much easier to use U2’s inbuilt features when manipulating and reporting on your data.
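To make the difference concrete, here is a rough sketch in Python rather than UniBasic. It models a record as a string delimited by attribute marks (CHAR(254)) and value marks (CHAR(253)). The function names and the sample transactions are made up for illustration, and it assumes each appended transaction keeps its pieces as values within its attribute.

```python
AM = "\xfe"  # attribute mark, CHAR(254)
VM = "\xfd"  # value mark, CHAR(253)

def appended_layout(transactions):
    """Traditional style: one attribute per transaction, pieces held as values."""
    return AM.join(VM.join(fields) for fields in transactions)

def split_layout(transactions):
    """One attribute per piece of information, one value per transaction."""
    columns = zip(*transactions)
    return AM.join(VM.join(column) for column in columns)

def count_transactions_split(record):
    """Count transactions by looking only at the first attribute."""
    first_attribute = record.split(AM, 1)[0]
    return first_attribute.count(VM) + 1 if first_attribute else 0

def timestamps_split(record):
    """All time-stamps live together in the first attribute."""
    return record.split(AM, 1)[0].split(VM)

# Hypothetical (time-stamp, amount) pairs
transactions = [("16000", "12.50"), ("16001", "7.25"), ("16001", "3.00")]
split_record = split_layout(transactions)
print(count_transactions_split(split_record))
print(timestamps_split(split_record))
```

With the split layout, the time-stamps all sit at one fixed location, which is exactly what a dictionary item needs in order to point at them.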
Something that often gets overlooked in the U2 world is best practice regarding dictionaries.
Before I get into it however, a very brief introduction to dictionaries for those who are new to UniVerse and UniData.
SQL databases have a schema which defines what data can be stored and where it is stored in relation to the rest of the data. This means every bit of data has a related section in the schema which gives it a name and a type.
UniVerse and UniData do not do this. The schema (dictionary) is simply there to describe the data (as opposed to define it). You can give sections of the data arbitrary names and/or data types. In fact, you can give the same location multiple names and types, or even create a dictionary item that describes multiple other sections! Each file uses another file called a dictionary to hold its ‘schema’ (which, for the rest of this post, will no longer be called a schema since that is misleading).
The UniData “Using UniData” manual describes a dictionary as containing “a set of records that define the structure of the records in the data file, called D-type records”. Now, it is very important to remember this next point: The manual is at best overly optimistic and at worst flat-out lying.
In SQL (excluding implementations such as SQLite), if you get a table schema and it informs you that the third column is an INTEGER called ‘Age’, then it is safe to assume it is what it says it is. In the worst case, you can be certain it won’t ever return a value of “Fred”. In UniVerse and UniData, the dictionary doesn’t even need to contain a record to describe the third attribute (an attribute is sort of like a column, but different).
Also of note to new players is that D-type records are not the only records in a dictionary file. There are 3 other types of records to consider. Once again, straight from the manual: ‘A dictionary may also contain phrases, called PH-type records, and items that calculate or manipulate data, called virtual fields, or V-type records. A user may also define a dictionary item to store user-defined data, called X-type records’.
What does this mean for you? Well, like most of U2, when looking at the records in a dictionary file, anything goes. Some could be accurately describing the file structure, while others could be getting fields from sections in a completely different file. Others again could have nothing to do with the data in the file at all and are merely there because a programmer has used the dictionary as a convenient dumping ground. You also need to consider that an item may simply be wrong.
There are 2 sides to this. 1) It can make development faster as you can just tack on extra bits of data with no maintenance work required. 2) As a result of 1, you can quickly find systems in a state where your records have mystery data and you cannot even begin to work it out without scouring through many programs and manually inspecting the data.
Even more confusing is that you can have multiple records referring to the exact same location but describing the data differently.
This is where best practices come in. Here are several simple rules that, if followed, should go a long way to ensure your dictionary files are useful, accurate and easier to maintain.
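A small Python sketch of the idea (not real U2 code): the dictionary is modelled as a plain mapping from item name to type, location and description. The item names, the sample record and the extract helper are all hypothetical. Two items point at attribute 3, nothing describes attribute 2, and nothing stops “Fred” turning up as an age.

```python
AM = "\xfe"  # attribute mark, CHAR(254)

# Hypothetical dictionary: item name -> (type, location, description)
dictionary = {
    "NAME": ("D", 1, "Customer name"),
    "AGE": ("D", 3, "Age in years"),
    "LEGACY.CODE": ("D", 3, "Old status code"),  # same location as AGE
}

def extract(record, item_name):
    """Return whatever sits at the item's location; nothing is type-checked."""
    _type, location, _description = dictionary[item_name]
    attributes = record.split(AM)
    return attributes[location - 1] if location <= len(attributes) else ""

record = AM.join(["Smith", "mystery data", "Fred"])

print(extract(record, "AGE"))          # the dictionary cannot stop this being text
print(extract(record, "LEGACY.CODE"))  # same data under a different description

# Attribute positions that no dictionary item describes
described = {location for _type, location, _description in dictionary.values()}
undocumented = sorted(set(range(1, len(record.split(AM)) + 1)) - described)
print(undocumented)
```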
- If you ever add new data to a record, then you MUST create, at a minimum, a D-type record to describe each piece of data.
- Always check that an appropriate dictionary item doesn’t already exist before creating a new one to reference a section of data in an existing file.
- If you come across a missing dictionary item, don’t ignore it. Either create it or add it to whatever bug-tracking system you use.
- Remember, after the type in attribute 1, you can write anything. Use this to describe what the data is if the name isn’t sufficiently self-descriptive.
- Also, if the data is effectively a foreign key for another file, use the end of attribute 1 to mark it as such (including the main file it references).
- Use the user-defined (X-type) record ability to add a single record that describes the general purpose/usage of the overall file. Give it a simple/recognisable name like README or FILEINFO.
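If you wanted to police these rules, a check along the following lines could help. This is only a sketch in Python, assuming the dictionary has already been read into a mapping of item name to a list of attributes, with the type (plus optional description) in attribute 1 and the location in attribute 2. The audit function, the sample items and the “FK ->” marker are all made up for illustration.

```python
def audit(dictionary, attribute_count):
    """Report attribute positions with no D-type item, and whether an
    X-type README/FILEINFO record exists."""
    described = set()
    for item in dictionary.values():
        tokens = item[0].split()
        is_d_type = bool(tokens) and tokens[0].upper() == "D"
        if is_d_type and len(item) > 1 and item[1].isdigit():
            described.add(int(item[1]))
    missing = [n for n in range(1, attribute_count + 1) if n not in described]
    has_readme = any(
        name in ("README", "FILEINFO") and item[0].split()[:1] == ["X"]
        for name, item in dictionary.items()
    )
    return missing, has_readme

# Hypothetical dictionary for a transaction log file
sample_dictionary = {
    "@ID": ["D", "0"],
    "TXN.TIME": ["D Time-stamp of each transaction", "1"],
    "AMOUNT": ["D Transaction amount", "2"],
    "CUST.ID": ["D FK -> CUSTOMER file", "4"],
    "README": ["X", "Log of customer transactions"],
}

missing, has_readme = audit(sample_dictionary, attribute_count=4)
print(missing, has_readme)
```

Here attribute 3 has no D-type item, so it is exactly the sort of thing to create or log in your bug tracker.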