Home Page list
WebTrees Website - Pre-Processing Tools using SQLite in Visual Basic et al #webtrees #visualbasic
The following paragraphs are extracted from private messages from Tom, davistom1, who kindly agreed to their being brought onto a page for wider dissemination. He talks about the relationship between his RM4 database and media collection and two websites he operates, one a webtrees, the other GenWeb, both worth a visit. He developed some procedures using SQLite and Visual Basic that he runs against his RM database prior to exporting a GEDCOM for upload to the sites. Tom has made a deployable version of his RMSanity application along with its source code and it's available for download on this page. (page created by another Tom, ve3meo).
I long ago developed a SQLITE based preprocessing program which reads the RM4 database (or RM5 for that matter) and does a whole bunch of sanity tests, including locating any duplicate or unused places, place details, media files, etc. which I can then resolve directly in RM4 (or RM5). I do the preprocessing/cleanup before export to GEDCOM in preparation for importing the latter into a webtrees database for a public website which I operate.
The computer based parts of my genealogy activity center around a desktop RM4 database (I haven’t actually switched to RM5 yet because of some of the headaches associated with the media handling re-design) and two websites which I operate. One website is my own webtrees based family history portal located at http://fotgp.com
and the other is the Northampton County, North Carolina GenWeb site located at http://ncgenweb.us/northampton
, for which I am site coordinator. All of these activities share parts of a large, currently ~11Gb, media library. The media library file tree is organized with top level directories representing some 75 repositories, mostly public record repositories, with sub tree branches mimicking the repository filing system but with some ad hoc repository branches also. My RM4 database currently links into about 7500 items in this media library. I have a whole bunch of homebrew code used to manage this mess, and a good bit of it directly queries the RM4 SQLite database for an array of sanity testing and data transformation between the various end uses.
My webtrees site is updated frequently from the RM4 database by first sanity testing it, manually tracking down and correcting discrepancies in it, backing it up, exporting it to GEDCOM, processing the GEDCOM through a bunch of transformation steps and finally importing the transformed GEDCOM into the site database. I do not routinely edit the webtrees database via its web interface, though that is what webtrees and its predecessor, phpGedView, were designed to do. My personal genealogy data baseline is my RM4 desktop database.
The connection with the GenWeb site is primarily the latter’s Cemeteries and Burials, Photo Albums and Places portals, the homebrew MySQL database which these portals employ, their links into the previously described media library and the homebrew code used to update the MySQL database. The only link between these and RM4 SQLite is that they share the same master media library, so no more about them here.
I get into RM SQLite databases two ways: (1) for some purposes I use the SQLite Developer app but mostly, (2) I use some homebrew code with the System.Data.SQLite ADO.NET provider library. I use SQLite Developer mainly in developing and debugging queries. Trying to distill the logic of my various SQLite queries to something which is both (1) compatible with the focus of your wiki and (2) presentable is a significant undertaking. Let me elaborate a bit and then we can explore whether it would be useful.
The RM4 database sanity test noted above is illustrative. I have written a desktop app which uses the aforementioned System.Data.SQLite ADO.NET provider to test for and report discrepancies in the following: (1) broken media links, (2) non-privatized SSNs, (3) orphaned place details, (4) un-used place details, (5) un-used places, (6) un-used media items and (7) duplicate media items. The app also requires two SQLite extensions, a function which tests for existence of a media library file denoted by its path and a NOCASE collation substitute for the infamous RMNOCASE. Incidentally, I don’t ever write to the RM4 database outside the RM4 app.
SQLite Queries in RMSanity
Following are examples of the hard coded SQLite queries used for some of these tests:
-- Broken media links
SELECT m, MediaFileExists(m) AS e
SELECT mediapath || mediafile AS m
) AS mm
-- Non-Privatized SSNs
FROM EventTable AS et
JOIN FactTypeTable AS ft
WHERE ft.Name='Soc Sec No'
AND NOT et.IsPrivate=1;
-- Orphaned place details
SELECT pd.*, p.Name
FROM PlaceTable AS pd
LEFT OUTER JOIN PlaceTable AS p
AND p.PlaceID ISNULL;
-- Un-Used media items
SELECT MediaPath || MediaFile AS media
FROM MultimediaTable AS mm
LEFT OUTER JOIN MediaLinkTable AS ml
ON mm.MediaID = ml.MediaID
WHERE ml.MediaID ISNULL;
-- Duplicate media items
SELECT COUNT(MediaID) AS c, MediaPath, MediaFile
GROUP BY MediaPath, MediaFile
WHERE c > 1;
SQLite Extensions in RMSanity
VB versions of the two extensions are:
<SQLiteFunction(Name:="RMNOCASE", FuncType:=FunctionType.Collation)> _
Friend Class NoCaseCollation
Public Overrides Function Compare(ByVal param1 As String, ByVal param2 As String) As Integer
Return String.Compare(param1, param2, True)
<SQLiteFunction(Name:="MediaFileExists", FuncType:=FunctionType.Scalar)> _
Friend Class MediaFileExists
Public Overrides Function Invoke(ByVal args() As Object) As Object
Usage of RMSanity
When successfully installed, the RMSanity app will be found in the Start Menu. Starts quickly, with a blank screen Windows Console followed shortly by an Open File dialog filtered to show RootsMagic (*.rmgc) database files. Open any database and then an Open File dialog opens to name and locate the RMSanity log file. On completing that step, RMSanity proceeds to run its seven tests:
- Broken media links
- Non-privatized SSNs
- Orphaned place details
- Unused media items
- Unused place details
- Unused places
- Duplicate media items
If all tests pass, the console window screen looks similar to this:
|RMSanity: all tests passed.|
As written, the application stops testing with the first test failed:
Each failed test then lists the information with which you can find and fix the problem using RootsMagic. In the above example, the individual with RIN 120 (I120) has a SSN fact that has not been privatized and so would be exported in the GEDCOM possibly to a website. Clearing all reported problems is Tom's objective before exporting to his webtrees website. The log file contains exactly what you see on the screen so that you can refer to it as you work through the problems.
To close the application, DO NOT USE the windows X if you do not wish to be alarmed by an error message. Type 'exit' into the RMSanity screen and press Enter.
Download and Installation
The first thing Tom says is that he developed RMSanity for his own use and purposes. If it is of use to anyone else, great, but he cannot support it. Use at your own risk! Risk is negligible because it is a read-only application that does not modify the database.
That said, this Tom thought it would be of interest to some and perhaps stimulate some others into further development of utilities in support of our use of RootsMagic. That RMSanity is in Visual Basic while the intent has been to develop the Bundled Utilities in Visual C# ought not be a deterrent - concepts are transferable if code is not. Those more comfortable with Visual Basic may find that the RMSanity source code is a great place to start.
RMSanity uses the version of SQLite that requires Windows .NET 4 Framework (not just the Client). Install .NET 4 Framework
Extract all the files under the Publish folder to a temporary folder. Find and click on setup.exe in the temporary folder. The installation will add a RMSanity folder to your start menu with a link to rmsanity.exe.
Probably the same problem and solution as described for RMtrix
Open Module1.vb to view the Visual Basic source code.