Archive for the ‘Database’ Category

Update to Installing UniVerse on CentOS post

I’ve been asked a few times how to get the U2 DBTools products, like XAdmin, to connect to their VirtualBox machine. I obviously left something out of the article, so I’ve gone back and updated the post with instructions for configuring your firewall. You can follow these same steps for UniData as well.

Just to make sure you don’t miss it, I decided I should also duplicate it here…

Configure firewall

If you want to access UniVerse (say using XAdmin from the U2 DBTools package), you will need to modify your iptables configuration.

First, in my case I have the VirtualBox network adapter set to ‘Bridged’. Now, in a shell window, update iptables: ‘sudo vi /etc/sysconfig/iptables’.

In vi, before any LOG or REJECT lines, add ‘-A INPUT -m state --state NEW -m tcp -p tcp --dport 31438 -j ACCEPT’.

Once that is done, you simply run ‘service iptables restart’ to pick up the changes.
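For reference, here is roughly what the relevant section of /etc/sysconfig/iptables looks like after the edit. The surrounding rules are the stock CentOS 6 defaults; your file may differ:

   -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
   -A INPUT -p icmp -j ACCEPT
   -A INPUT -i lo -j ACCEPT
   -A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
   -A INPUT -m state --state NEW -m tcp -p tcp --dport 31438 -j ACCEPT
   -A INPUT -j REJECT --reject-with icmp-host-prohibited

Port 31438 is the UniRPC port that XAdmin and the other U2 DBTools connect through.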

The updated iptables file

XAdmin once connected

Installing UniVerse on CentOS 6.2

July 8, 2012

Previously we looked at installing UniData on a Linux machine. This time around we are going to install UniVerse on CentOS 6.2. I’ve chosen CentOS as it is essentially a re-branded (de-branded?) version of Red Hat Enterprise Linux. RHEL is officially supported by Rocket Software, making CentOS a great free OS for playing around with U2 databases.

As always, I suggest you do this in a Virtual Machine so that you can create as many dedicated test systems as your heart desires (or storage limits). For this I’ve used Oracle’s Virtual Box, which is available for free.

Requirements

Okay, so to start, let’s make sure we have everything we need to do this:

  1. Suggested: Dual Core CPU or better (particularly if running as a VM)
  2. Suggested: 2GB RAM or better (particularly if running as a VM)
  3. Virtual Box software
  4. Latest CentOS LiveCD/LiveDVD ISO (as of 2012/06/07, version 6.2)
  5. UniVerse Personal Edition for Linux

Preparing the VM

After you have installed Virtual Box and have it running, we will need to create a new image to run CentOS. Doing this is as simple as clicking the ‘New’ button and following the prompts.

Create a new Virtual Machine

Most questions can be left as is, except for the operating system. For the operating system, set it to ‘Linux’ with version ‘Red Hat’.

My old laptop has 2GB of RAM, so I’m assigning 1GB to this machine image.

Setting the Virtual Machine's memory

I stick with dynamic allocation of my disks for most testing as it is easier to move the smaller images around. For more serious work, you might be better served creating a fixed disk size as it generally performs better.

Selecting the disk type

The default 8GB disk is just fine. You can always create and add more disks later.

Now that you have your machine image ready, select the image and click on the settings button. In this screen click on the storage option and select the DVD drive from the IDE Controller. On the right-side there is a small CD/DVD image you can click on, then select the option that lets you choose a CD/DVD image. This will let you select the CentOS ISO you downloaded so we can boot from it.

Virtual Machine settings

While in the settings screen, you should also add a shared folder and click on the read-only and auto-mount checkbox options.
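If you would rather script all of this, roughly the same setup can be done from the command line with VBoxManage. This is only a sketch of the clicks above; the VM name, ISO filename and bridged adapter name are assumptions you will need to adjust:

   VBoxManage createvm --name "centos-universe" --ostype RedHat --register
   VBoxManage modifyvm "centos-universe" --memory 1024 --nic1 bridged --bridgeadapter1 eth0
   VBoxManage createhd --filename centos-universe.vdi --size 8192
   VBoxManage storagectl "centos-universe" --name "IDE" --add ide
   VBoxManage storageattach "centos-universe" --storagectl "IDE" --port 0 --device 0 --type hdd --medium centos-universe.vdi
   VBoxManage storageattach "centos-universe" --storagectl "IDE" --port 1 --device 0 --type dvddrive --medium CentOS-6.2-i386-LiveDVD.iso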

Installing CentOS

CentOS-UniVerse [Running]

If you are not installing this as a virtual machine, you can burn the ISO image to CD/DVD and start the machine with the CD/DVD in the drive (or on modern machines, via USB drive). Only do this if you know what you are doing or are intending to have CentOS as the sole operating system. From here on in, I’ll be assuming you are taking the VM route.

Select the VM image and click on the start button.

CentOS should auto-boot from the CD/DVD image. Once it has loaded and is sitting at the desktop, there is an ‘Install to Hard Drive’ option. Click on this and follow the installation instructions CentOS provides you. Generally speaking, the default options are the ones you want.

Early on in the installation CentOS will issue a ‘Storage Device Warning’. This is for your newly created 8GB disk. In this case you can select ‘Yes, discard any data’. Warning: If you are not doing this in a VM, you must know what you are doing or you risk losing data.

Storage Device Warning

Where it asks you for hostname, you can leave it as the default. I’ve taken to naming them {OS}.{DB} in lowercase; so in this case I’m naming it ‘centos.universe’.

Once the installation is finished, you can restart the VM image. Be sure to remove the CD/DVD image so that it boots from the hard drive. It will ask you a few final questions once it restarts (such as entering a non-root user) before it takes you to the login prompt.

To make our life easier, once we have logged in, we will add ourselves to the list of allowed sudoers. To do this, open a terminal window by selecting Applications -> System Tools -> Terminal. I also added this shortcut to the desktop since I use it so much.

In the terminal, switch to the root user by running ‘su -’. We can now edit the list of sudoers using visudo. At the end of the file, add ‘{user} ALL=(ALL) ALL’, where {user} is the username you created for yourself earlier.
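Putting that together, the session looks something like this (I’m using my own username here; substitute the one you created):

   su -                        # become root; enter the root password when prompted
   visudo                      # safely edits /etc/sudoers
   # append this line at the end of the file, then save and quit:
   itcmcgrath ALL=(ALL) ALL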

Now is a good time to shut down and take a back-up of the image so you can clone as many freshly minted VMs as you want. I also try to do some common tasks first, such as installing/updating gcc (terminal: ‘sudo yum install gcc’), installing Google Chrome (http://google.com/chrome) and ant (terminal: ‘sudo yum install ant’).

Installing UniVerse

Download UniVerse Personal Edition inside your VM image and place it into a temporary directory.

While you are waiting for it to download, you can create the ‘uvsql’ user we will require later. From the ‘System’ menu, select ‘Administration’ -> ‘Users and Groups’. Once you have the program up, click on the ‘Add User’ button, then fill in the username and password fields. Click OK and exit out of the user manager.

Create uvsql user
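If you prefer the terminal, the same account can be created with useradd (a quick sketch; passwd will prompt you for the new password):

   sudo useradd uvsql          # create the uvsql account UniVerse requires
   sudo passwd uvsql           # set its password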

Open up a terminal window then change to your temporary directory where the UniVerse download is located. The first step will be to extract everything from the compressed file; to do this you can type in ‘unzip UVPE_RHLNXENTINT_11.1.9.zip’. Replace ‘UVPE_RHLNXENTINT_11.1.9.zip’ with whatever the downloaded filename is in your case.

The next step will be to extract the uv.load script used to install UniVerse. To complete this step, run this command to extract it from the STARTUP archive: ‘cpio -ivcBdum uv.load < ./STARTUP’.

You can now run uv.load as root with the following command: 'sudo ./uv.load'. Select 1 on the first prompt to install UniVerse with 'root' as the default owner. This is okay as we are just building a dev system.
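Putting the extraction and install steps together, the whole sequence looks something like this. I'm assuming the download kept the filename above and is sitting in ~/temp; adjust for your own path and version:

   cd ~/temp
   unzip UVPE_RHLNXENTINT_11.1.9.zip     # extract the downloaded archive
   cpio -ivcBdum uv.load < ./STARTUP     # pull the installer out of the STARTUP archive
   sudo ./uv.load                        # run the installer as root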

On the next screen select option 4 to change the 'Install Media Path' to the path of the temporary directory you extracted UniVerse into. In my case it was '/home/itcmcgrath/temp'. The rest of the options are fine left at their defaults. Press [Enter] to continue with the installation process.

UniVerse will now be installed then put you into an administrative program. [Esc] out of this to drop to a UniVerse prompt then type in 'QUIT' to drop back to the command line.

There you have it, a working UniVerse server running in a Virtual Machine. Shutdown your VM and take a copy of the machine image so you have a fresh copy of UniVerse in easy reach.

Installation completed

Update: Configure firewall

If you want to access UniVerse (say using XAdmin from the U2 DBTools package), you will need to modify your iptables configuration.

First, in my case I have the VirtualBox network adapter set to ‘Bridged’. Now, in a shell window, update iptables: ‘sudo vi /etc/sysconfig/iptables’.

In vi, before any LOG or REJECT lines, add ‘-A INPUT -m state --state NEW -m tcp -p tcp --dport 31438 -j ACCEPT’.

Once that is done, you simply run ‘service iptables restart’ to pick up the changes.

The updated iptables file

XAdmin once connected

Disclaimer: This does not create a UniVerse server that will be appropriate to run as a production server.

Installing UniData on Fedora 14

May 30, 2011

For some future upcoming posts, I needed to install UniData on a Linux Machine.

Since I’m already going through the effort of freshly installing both Fedora and UniData, I thought I would share the required steps so anyone else who wants to create a similar test system can do so just as easily. It turns out to be quite simple and straightforward, with only minor set-up tasks along the way.

Firstly, I suggest you do this in a Virtual Machine so that you can create as many dedicated test systems as your heart desires (or storage limits). For this I’ve used Oracle’s Virtual Box, which is available for free. To make it easier, I’ve also got instructions for the few extra preparation steps you will need to do the Fedora installation in Virtual Box.

Requirements

Okay, so to start, let’s make sure we have everything we need to do this:

  1. Suggested: Dual Core CPU or better (particularly if running as a VM)
  2. Suggested: 1GB RAM or better (particularly if running as a VM)
  3. Virtual Box software
  4. Fedora 14 ISO
  5. UniData Personal Edition for Linux

Preparing the VM

After you have installed Virtual Box and have it running, we will need to create a new image to run Fedora. Doing this is as simple as clicking the ‘New’ button and following the prompts. Most questions can be left as is, except for the operating system. For the operating system, set it to ‘Linux’ with version ‘Fedora’.

The default 8GB Dynamic disk is just fine. You can always create and add more disks later.

Now that you have your machine image ready, select the image and click on the settings button. In this screen click on the storage option and select the DVD drive from the IDE Controller. On the right-side there is a small CD/DVD image you can click on. This will let you select the Fedora 14 ISO you downloaded so that it will boot from it.

While in the settings screen, you should also add a shared folder and click on the read-only and auto-mount checkbox options.

Installing Fedora

Fedora 14 VM for UniData
If you are not installing this as a virtual machine, you can burn the ISO image to CD/DVD and start the machine with the CD/DVD in the drive. Only do this if you know what you are doing or are intending to have Fedora as the sole operating system.

If you are installing this as a virtual machine, select the VM image and click on the start button.

Fedora should auto-boot from the Fedora image. Once it has loaded and is sitting at the desktop, there is an ‘Install to Hard Disk’ option. Click on this and simply follow the installation instructions Fedora provides.

Installing UniData

Before you can install UniData on Fedora 14, you must first install the libgdbm.so.2 library. You can download and install the RPM for libgdbm.so.2 here.

Apart from the above missing dependency, it is as simple as following the installation manual provided by Rocket Software.

The only other point of note from the initial installation is that not all the escape characters in udtinstall are processed correctly, so expect to see a few lines like “\tWould you like to continue?”

Now you will need to set up the required environment variables. To do this, ensure you are in a shell as root or that you run these commands as root. Change to the /etc/profile.d directory. In here we are going to create a unidata.sh file that will contain all the environment variables UniData requires.

Just type in ‘gedit unidata.sh &’ to bring up a text editor (or just use vi/emacs) to paste the following into:

   UDTHOME=/usr/ud72 ; export UDTHOME                                   # UniData install directory
   UDTBIN=$UDTHOME/bin ; export UDTBIN                                  # UniData executables
   PATH=$PATH:$UDTBIN ; export PATH                                     # make udt, startud, etc. available
   LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$UDTBIN ; export LD_LIBRARY_PATH    # UniData shared libraries
   UDTERRLOG_LEVEL=2 ; export UDTERRLOG_LEVEL                           # error log verbosity

Restart the machine or run the new script as root and you should be able to run ‘startud’ as root. If UniData boots up correctly, open a non-root shell and type in ‘cd $UDTHOME/demo’ then ‘udt’ and you should successfully jump into ECL.
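As a quick smoke test, the whole sequence looks something like this (the first two commands as root, the rest from a normal user's shell):

   . /etc/profile.d/unidata.sh     # load the new environment (or just reboot)
   startud                         # start the UniData daemons
   # then, as a non-root user:
   cd $UDTHOME/demo
   udt                             # should drop you into ECL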

There you have it, a working UniData server running in a Virtual Machine.

Disclaimer: This does not create a UniData server that will be appropriate to run as a production server.


Data Integrity

February 27, 2011

One of the features not present in UniData that you may have become used to in the world of SQL is referential integrity.

Data is one of the most valuable assets of a company. For this reason alone, it should be treated with the utmost respect and professional care. Everybody knows that backing up data is essential, but what data are you backing up?

If the data is already corrupt, you’re in a whole world of hurt. How long has it been corrupt? Has it corrupted other data? Has it already impacted the business, and to what extent? You can’t just restore from several months ago. You have to spend the time manually working out what went wrong, how to fix it, and potentially trawling through backups to find data to reinstate.

Here I should clarify exactly what I’m referring to by ‘corrupt data’. I’m not talking about OS-level corruption; from here on I will be talking about 2 types of logical corruption:

  • Referential corruption: a record refers to another record (a foreign key) that does not exist
  • Domain corruption: a value falls outside its field’s ‘domain’ (wrong type, invalid enumeration, bad length, or NULL where a value is required)

Unlike the major databases (such as MSSQL, Oracle and MySQL), UniData and UniVerse do not have logical data integrity constraints supported in the database layer. This leaves it up to each individual application to ensure data integrity.

Anyone working with databases knows that bugs (both current and old) can result in logical inconsistencies creeping into your data. The more denormalised your data, the higher the chance of this corruption.

Some of this corruption will become apparent immediately because a process will fail, requiring you to locate and fix both the cause of the corruption and the corruption itself. Surprisingly, these are not the ones you should be most worried about. The worst are the ones you don’t notice, because they don’t cause the system to visibly malfunction. These can fester in your system for years, silently corrupting data derived from them and potentially impacting business decisions. Over time the data becomes much harder to repair, since the information needed may no longer be readily at hand. If/when these eventually cause a problem, it will be much harder and more time-consuming to address, if even possible.

Since we have to handle logical data integrity entirely in the application layer, U2 databases are somewhat more susceptible to these issues from code bugs. To combat this, there are 2 methods I propose you adopt.

The first is a Data Integrity Audit (DIA) you can schedule to run regularly in production. This validates your data and reports on any inconsistencies it encounters. It helps you identify issues earlier and can help track down the programs/conditions that are causing the corruption. We have already implemented this system for ourselves and I’ll explain how we did it below.

The second method is based on the above DIA. Modified to run from file triggers, it becomes a system you can use while testing (Unit, System and User Acceptance Testing) that reports exactly what program/line is writing the corrupted record as it happens. Catch it BEFORE it reaches production! However, I don’t recommend implementing this in production (at least, not without great care/load testing) since it will have performance implications that may be unacceptable.


Implementing a solution

Alright, enough of the prelude. Let’s talk about implementing a DIA program in your system. It isn’t as hard as you might think, and it can be set up incrementally so you can cover your most important data first.

The system has 4 parts to set up:

  1. Defining the Rules
  2. Storing the Rules
  3. Checking the Data
  4. Reporting on Violations

Defining the Rules

The first step is defining the logical rules that should be constraining your data. The rules will fall into 2 categories:

  • Referential integrity: Identify any attributes that are foreign keys (or lists of foreign keys)
  • Domain integrity: Specify the ‘domain’ of the field. This includes type (alpha, numeric, etc), enumerations, length, and if NULL is allowable.

Looking at a few of your key tables, you should be able to quickly identify some basic rules your data should naturally abide by. Write these down, as these will be some easy rules to start testing with.

Storing the Rules

The second step is determining how to store the rules. Although you can do this however you want, there are several reasons that make using the dictionary file ideal:

  • Placing the constraints in with the schema (both are structural metadata). Collocation is a good thing.
  • Attribute 1 can store anything after the type; it allows you to store the constraint directly with the section of the schema you are constraining!
  • X-Type Attributes allow you to use enumerations (part of domain integrity) while still keeping them defined in the schema, instead of elsewhere.
  • It allows you to easily test and/or enforce the constraints with the ‘CALCULATE’ function (more on this later)

So, how exactly do you store the constraints in with the dictionary records? Here is the scheme we use:

TYPE [FKEY filename [PING f,v,s]] [MAND] [ENUM enum_item]

  • FKEY: Foreign key to ‘filename’
  • PING: Checks for @ID in the foreign record location <f,v,s>
  • MAND: Value cannot be NULL
  • ENUM: Value must be an enumeration in the dictionary X-type record ‘enum_item’

When attribute 6 of the dictionary item indicates that the data is a multivalued list, FKEY, MAND, ENUM and DATATYPE should adhere to it and treat each item in the list separately. The only special case is MAND, which only causes a violation when a multivalue in the list is empty. That means it does not cause a violation when there is no list at all. If you want to cover this case, you can create another non-multivalued dictionary item as well and apply the MAND rule to it.
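As an example, here is what a hypothetical D-type dictionary item carrying these constraints might look like, for a customer reference on an ORDERS file (the file and field names are made up; attributes 2-6 follow the usual D-type layout):

   CUST.ID
   001 D FKEY CUSTOMERS MAND
   002 2
   003
   004 Customer
   005 10L
   006 S

Attribute 1 says the field is a D-type whose value must be a non-NULL key of an existing CUSTOMERS record.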

Checking the Data

The third part is how you will test/enforce these constraints:

  • Production: A program that, given a filename, reads in the dictionary items and associated constraints. It can then test each record and report any violations. This would typically be run as part of a nightly job and/or, if you are set up for it, on a backup/restore of production onto a development machine.
  • Development: An update trigger subroutine that is only implemented on development. This allows you to transparently test whether new or modified code is corrupting your data before it even makes it into production. Although this would typically not be implemented in your actual production system due to performance impacts, there is no technical reason it cannot be done if so desired (even just for selected files).

These methods are not mutually exclusive and are designed to cover different situations. The first is a post corruption check that allows you to identify issues faster than you normally would. The second allows you to provide better test coverage and reduce the risk of introducing faulty code into your production system.
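As a sketch of how the production option might be scheduled, a nightly cron entry could pipe the audit command into the database. DIA.CHECK.ALL, the account path and the log file are all hypothetical names:

   # run the data integrity audit at 2am against the production account
   0 2 * * * cd /u2/accounts/prod && echo 'DIA.CHECK.ALL' | /usr/ud72/bin/udt >> /var/log/dia.log 2>&1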

Reporting the Violations

The fourth and final part of the system is how you report it.

There are many options you may want to consider, depending on your needs and which of the 2 methods above you are considering it for.

We decided upon a non-obtrusive option that allowed us to build either reports or select lists from the results. This method requires you to create a new file to store the results. For the sake of this article, let us call it DIA_RESULTS. You can clear this file just before running the DIA program, or performing tests if you are using the trigger method.

In DIA_RESULTS, each record should contain the following information:

  • Date failed
  • Time failed
  • Filename the violation was on
  • Key the violation was on
  • Dictionary item used when the violation occurred
  • Rule name the violation occurred on
  • The value that caused the violation (just in case it changes before you get to it)
  • If from a trigger, the current call stack

Using this information it is easy to print off reports, create select lists to get to the records and to determine exactly what was wrong in the data.
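To make that concrete, here is a hypothetical DIA_RESULTS record following the layout above (the ID scheme and all values are made up):

   ORDERS*12345*CUST.ID*1
   001 15764          (date failed, internal format)
   002 30902          (time failed, internal format)
   003 ORDERS         (filename the violation was on)
   004 12345          (key the violation was on)
   005 CUST.ID        (dictionary item used)
   006 FKEY           (rule violated)
   007 99999          (the offending value - a customer that does not exist)
   008 DIA.TRIGGER    (call stack, when run from the trigger method)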

Optimising the set up of your UniData data [Part 1]

September 3, 2010

One of the benefits of U2 data servers is that it can be extremely quick to turn around a new system. The unfortunate downside is that this makes it extremely easy to ignore the architecture of your system. This can lead to future system performance issues and harder-to-maintain programs.

Here I’ll be looking at the set up of your files and records (tables and columns for those still grasping UniData/UniVerse). Your system revolves around your data, so if you don’t get it right to start with, it inevitably leads to a sub-optimal system. What I won’t be discussing here is the usual modulo/block-size related maintenance of your files; there is already literature in the manuals for this topic.

To start with, you should have already read my previous post about correctly setting up the layout of your files and the need to create all the relevant D-type dictionary items. With that in mind, I have a story for you…

This story is about Johnny and Alicia, who are both admin staff working for a sales company back in the 1930s. Both have a large set of contracts that they store in folders in a filing cabinet.

Occasionally their managers will ask them to find a contract that is being handled by a certain sales rep. Although they hate this task, each time they manually search through the stack of contracts to retrieve it. Funnily enough, in the time it takes Johnny to find one, Alicia can usually find at least two.

Curiosity gets the better of Johnny who eventually asks Alicia how she was so fast.

“It’s easy, I have moved the page with the sales rep’s name to the front of the contract.”

Dang! So simple! Johnny realised that having to dig ten pages deep into each contract was senseless!

Fortunately, admin staff can now use digital retrieval systems, so they don’t have to think about this sort of small detail any more. The need to pay attention to this detail hasn’t gone away though. Now it rests with us.

Not only should you ensure the layout of data is in the correct format, but you should also pay attention to the order of your data. It should be organised with the most frequently searched and utilised data earlier in the record. Since the record fields are separated by delimiters, using and querying later attributes requires the engine to scan every character up to the requested attribute to determine where it starts. By moving the most frequently used data to the beginning of a record, you reduce the amount of work required to initially find the data.

Here are some timings from a simple test run I performed on our system.

The setup: A file with modulo 10007, pre-filled with records keyed from 10000 to 99999. Attributes 1, 2, … up to 29 are each set to the key. I have created a D-type dictionary item for each attribute timed (D1, D2 & D29).

The test: Perform a select on the file with the attribute equal to a value (E.g. SELECT TIMINGS WITH D1=”12345″). Repeat this 1000 times for each attribute tested.

Results:

Timings for 1000 SELECTs

Data in <1>:  338655 (100.00%)
Data in <2>:  342134 (101.03%)
Data in <29>: 471811 (139.32%)

Even with these small records, you can see the difference you can achieve by having your data in the correct order. Scale this up to larger files with bigger records and more complex select statements, combined with the processing of these records in your subroutines, and it can make a significant difference in execution times across a system.
