Tuesday, April 28, 2015

Merging Access MDB files

Recently I came across a task that seemed trivial, but turned out into an adventure on a second look. I needed to merge 33 MDB files having one or two tables with same structure each. I was disappointed that Access has no standard means to do so I also could not use Excel as the data could have been spoiled by its auto-type function (will they ever fix that?!).

So I started creating a program that would do the task. The fact that tables can be named differently (thanks for a silly decision of a person who created them in the first place) I faced unexpected complexity. The following query can be used to get the list of tables in Access MDB:

SELECT Name FROM MSysObjects WHERE Type=1 AND Flags=0

However it does not work due to permission schema if executed by a program. Running it under administrator does not work either (famous Microsoft integration of solutions is to be blamed). SO I needed to grant the required access with another query before above query starts working:

GRANT SELECT ON TABLE MSysObjects TO PUBLIC

Note: Just hold on for a second and think how stupid it seems. I can't do a query but I can run another one to grant the required access and voila it does the magic. So what is the point of restriction in the first place? Ah?...

This is the code that does the job:

var conn = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + file + ";" +
@"Jet OLEDB:System Database=C:\Documents and Settings\\Application Data\\Microsoft\Access\System.MDW");

var cmdToExecute = new OleDbCommand();
cmdToExecute.Connection = conn;
cmdToExecute.CommandType = System.Data.CommandType.Text;
cmdToExecute.CommandText = "GRANT SELECT ON TABLE MSysObjects TO PUBLIC";

conn.Open();

// set permissions to read system object to get the list of tables
try
{
cmdToExecute.ExecuteNonQuery();
}
catch(Exception e)
{
Console.WriteLine(e.Message);
return;
}

Note: You must reference a schema file otherwise you will an exception during query execution.

And the second step - getting the list of tables:

cmdToExecute.CommandText = "SELECT Name FROM MSysObjects WHERE Type=1 AND Flags=0";
try
{
var reader = cmdToExecute.ExecuteReader();
while(reader.Read())
{                        
tables.Add(reader[0].ToString());
}
reader.Close();
}
catch()...

Then for each table selected I ran the query to read the content and added it to a StringBuilder object:


cmdToExecute.CommandText = "SELECT * FROM " + table;
reader = cmdToExecute.ExecuteReader();
while(reader.Read())
{
var sb = new StringBuilder();

for(int i=0; i
{
sb.Append(reader.GetValue(i).ToString());
sb.Append(delim);
}
sb.Append(file);
sb.Append(delim);
sb.Append(table);
sb.Append("\n");
}


This is pretty much it. The only other thing is that I faced some corruption within the files scanned and needed to use Access to fix the database before it starts working for those.

Feel free to copy and modify the code as you like. Hope it helps.

Full solution code is provided below:

static void Main(string[] args)
{
// read folder
var files = Directory.GetFiles(inDir, "*.mdb", SearchOption.TopDirectoryOnly);

// prepare to writing results into a file

using (var stream = new StreamWriter(outDir + @"\alldata.txt"))
{
// write the header
foreach(var header in headers)
{
stream.Write(header + delim);
}
stream.Write("\n");

// for each MDB file, read it and save content into CSV
foreach(var file in files)
{
Console.WriteLine(file);

var conn = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + file + ";" +
@"Jet OLEDB:System Database=C:\Documents and Settings\\Application Data\\Microsoft\Access\System.MDW");

var cmdToExecute = new OleDbCommand();
cmdToExecute.Connection = conn;
cmdToExecute.CommandType = System.Data.CommandType.Text;
cmdToExecute.CommandText = "GRANT SELECT ON TABLE MSysObjects TO PUBLIC";

conn.Open();

// set permissions to read system object to get the list of tables
try
{
cmdToExecute.ExecuteNonQuery();
}
catch(Exception e)
{
Console.WriteLine(e.Message);
return;
}

var tables = new List();

// get data
cmdToExecute.CommandText = "SELECT Name FROM MSysObjects WHERE Type=1 AND Flags=0";
try
{
var reader = cmdToExecute.ExecuteReader();
while(reader.Read())
{                        
tables.Add(reader[0].ToString());
}
reader.Close();
// reading data for each table
foreach(var table in tables)
{
cmdToExecute.CommandText = "SELECT * FROM " + table;
reader = cmdToExecute.ExecuteReader();
while(reader.Read())
{
var sb = new StringBuilder();

for(int i=0; i
{
sb.Append(reader.GetValue(i).ToString());
sb.Append(delim);
}
sb.Append(file);
sb.Append(delim);
sb.Append(table);
sb.Append("\n");

stream.Write(sb.ToString());
}
}
}
catch(Exception e)
{
Console.WriteLine(e.Message);
return;
}

conn.Close();
}

}

Console.ReadKey();
}