  #1  
Old 09-16-2011, 01:14 PM
mattsk42's Avatar
$100 off new Directv subsp.PM me BEFORE signing up
 
Join Date: Oct 2004
Location: SiouxFalls by way of Pierre,SD
Supporting Member
Data Conversion Help (anyone do this for a job?)

I'm just starting a new job where PART of my job is to be a "Conversion Specialist". Problem is, I know nothing about it. I have never programmed or done anything related; it's been all IT or graphic design. I'm learning SQL as much as I can.

What do I need to know, and what does it really involve? This will be for converting a sheriff's office database records TO our software that handles those records.

We have a program called "sypherlink harvester" to help, but I don't really see anything I recognize when looking at the specs for it.
__________________
Subscribe/buy Bass Gear Magazine
www.bassgearmag.com

Spector Club #231

Last edited by mattsk42 : 09-16-2011 at 01:24 PM.
  #2  
Old 09-16-2011, 01:22 PM
Selta's Avatar
www.HeavyMetalOpera.com

Unofficialy endorsing EBMM, Avatar Speakers
 
Join Date: Feb 2002
Location: Seattle (ish), WA
Supporting Member
Wait... you took on a job that you know nothing about?
__________________
Sterling 5 HH / Bongo 6 HS / Sterling 5 H
|
V

SansAmp RPM
|
V
FOH

Yes, I wear kilts from Utilikilt
  #3  
Old 09-16-2011, 01:25 PM
mattsk42's Avatar
$100 off new Directv subsp.PM me BEFORE signing up
 
Join Date: Oct 2004
Location: SiouxFalls by way of Pierre,SD
Supporting Member
Fixed.
  #4  
Old 09-16-2011, 02:00 PM
Selta's Avatar
www.HeavyMetalOpera.com

Unofficialy endorsing EBMM, Avatar Speakers
 
Join Date: Feb 2002
Location: Seattle (ish), WA
Supporting Member
So, do you know at all what kind of DB it even is?
  #5  
Old 09-16-2011, 10:44 PM
Registered User
 
Join Date: Apr 2007
Location: Finland (Northern Europe)
Hi.

LOL.

If you have absolutely no idea how to do it, DO NOT EVEN TRY. Outsource the conversion.

With (semi?)sensitive data, someone with no expertise can do a world of harm.


I once spent a whole working week finding a safe and reliable method of transferring a certain manufacturer's spare part inventory to our new system. Since it only involved changing the syntax, and it had to be usable by novices as well, I did it with Word and Excel.

The "expert" chose to ignore my work and transferred the parts inventory to the system four times with different syntaxes, ending up with 100,000 false codes. Every one of those had to be removed by typing in the part number, 15 digits or so. One by one.

Good luck, you'll definitely need it.

Regards
Sam
  #6  
Old 09-18-2011, 08:06 PM
Registered User
 
Join Date: Aug 2005
Location: Seattle, WA
I do this all the time. It's just a matter of mapping the old data format to the new data format in a consistent way. For me, it's one of the more straightforward types of software development jobs.
  #7  
Old 09-18-2011, 10:18 PM
Registered User
 
Join Date: Apr 2011
Location: Bangkok Thailand
Data Quality

Quote:
Originally Posted by seventhson View Post
I do this all the time. It's just a matter of mapping the old data format to the new data format in a consistent way. For me, it's one of the more straightforward types of software development jobs.
I would add that, in addition to the mapping, profiling the data is useful: it lets you identify the valid ranges and values for each data element in the mapping document / data dictionary. You can then use that information to code the appropriate checks into your conversion.

Profiling also gives you some idea of how clean or how dirty the data is. Data quality has a direct effect on how easy or difficult the conversion will be once the data elements are mapped.
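
A minimal sketch of what a column profile might look like, in Python. The field names and sample rows are made up for illustration; a real profile would read from the source database and cover every mapped element.

```python
from collections import Counter

def profile_column(rows, column):
    """Summarize one column: missing count, distinct count, min/max, common values."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v not in (None, "")]
    return {
        "total": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "top": Counter(present).most_common(3),
    }

# Hypothetical sample rows, not taken from any real system.
sample_rows = [
    {"booking_id": "1001", "sex": "M"},
    {"booking_id": "1002", "sex": "F"},
    {"booking_id": "1003", "sex": ""},
    {"booking_id": "1004", "sex": "m"},
]
sex_profile = profile_column(sample_rows, "sex")
```

Here the profile shows one missing value and three distinct codes where two were probably expected (the lowercase "m"), which is exactly the kind of finding to bring to the customer before coding.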

When dealing with data quality I use the following guide:
1. Assume the system you are working on has dirty data and plan accordingly,
2. When the customer (whether internal or external) says don't worry, the data is / should be clean, see number 1.

I've found that the most common hangup in the interface programs I've written is date format processing. Different systems and databases tolerate (or don't) different date formats, and this is where I have had issues on almost every project. One example: '2011-1-1' is acceptable as Jan 1 2011 in some systems, but others require '2011-01-01', so you need to make sure the leading zeros are there. There are more issues, and more ways to skin the cat, than I can enumerate here, but make sure the issues are well understood and handled in your conversion. Experience is your best friend here.
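
One possible way to handle the leading-zero case is to parse every incoming date and re-emit it in a single zero-padded format. This is only a sketch: the list of accepted formats is an assumption (including US month/day order for slashed dates) and would need to come from profiling the real source data.

```python
from datetime import datetime

# Assumed source formats; extend or reorder after profiling the real data.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")

def normalize_date(raw):
    """Return the date as zero-padded YYYY-MM-DD, or None if no known format matches."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None  # caller decides whether to reject or flag the record
```

With this, normalize_date("2011-1-1") and normalize_date("1/1/2011") both give "2011-01-01", while an unparseable value comes back as None so it can be routed to review instead of being guessed at.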

Another issue is decimal precision. One example: some database and software systems go to 18 decimal places of precision and others to 30. It's not a big deal for many systems, but I once worked on a project where the source went out to 30 decimal places and the target handled only 18, and we had to come up with a solution for that.

Don't get me wrong, I've seen tricky crap in other data domains too; it just seemed that every project at least had the date issues to worry about.
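
As an illustration only (not the solution used on that project), here is one way to fit a 30-place value into an 18-place target with Python's decimal module while recording whether anything was lost. The rounding mode is an assumption; the customer should decide between rounding, truncating, or rejecting.

```python
from decimal import Decimal, ROUND_HALF_EVEN, localcontext

# Target scale of 18 decimal places, as in the example above.
TARGET_QUANTUM = Decimal("1e-18")

def fit_to_target(raw):
    """Round a decimal string to 18 places and report whether precision was lost."""
    with localcontext() as ctx:
        ctx.prec = 60  # enough working digits that quantize does not overflow
        source = Decimal(raw)
        rounded = source.quantize(TARGET_QUANTUM, rounding=ROUND_HALF_EVEN)
    return rounded, rounded != source

value, lost = fit_to_target("0.123456789012345678901234567890")
```

The second return value makes it easy to count, during profiling, how many source values would actually change.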

Referential integrity is a super important data issue. For example, if your sheriff system has one table that stores violations and another that stores perpetrators, and the rule is that every violation has a perp, then every perp referenced in the violation table needs a matching record in the perp table. A really active perp may have many different violations, but you would only want one instance of that perp in the perpetrator table. This is called a one-to-many relationship. Issues arise when a perp appears in the violation table but for some reason is not in the perp table. Some people call this an orphan record. You need to be able to handle it, as some systems are poor at enforcing referential integrity. It's best to find these in the data profiling stage so they can be presented to the customer for clarification and direction on what action to take.
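
Orphans like this can usually be found with a LEFT JOIN. The sketch below uses an in-memory SQLite database from Python; the table and column names are invented for the example and will differ in the real sheriff system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE perpetrator (perp_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE violation (violation_id INTEGER PRIMARY KEY, perp_id INTEGER, charge TEXT);
    INSERT INTO perpetrator VALUES (1, 'A. Example'), (2, 'B. Example');
    INSERT INTO violation VALUES
        (10, 1, 'speeding'),
        (11, 1, 'trespass'),   -- same perp, second violation (one to many)
        (12, 3, 'vandalism');  -- perp 3 does not exist: an orphan
""")

# Violations whose perp_id has no matching row in the perpetrator table.
orphans = conn.execute("""
    SELECT v.violation_id, v.perp_id
    FROM violation AS v
    LEFT JOIN perpetrator AS p ON p.perp_id = v.perp_id
    WHERE p.perp_id IS NULL
    ORDER BY v.violation_id
""").fetchall()
```

Running the same query against the source during profiling gives a concrete list of orphan records to take back to the customer.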

Another issue is the rank of data importance. In any system there are some data elements that absolutely cannot have mistakes, and other, less important elements where you can be a little more lenient. The ideal is that all data elements should be perfect. All customers want it perfect, but none seem to want to pay for the time it takes to correct, especially if they have crappy data. Every project has a time limit and a budget, so it's best to find out up front what absolutely has to be dead on. That way the tier 1 data elements get done correctly for sure, and the rest of the time goes to making sure the tier 2 and tier 3 elements have acceptable quality. Please note I am saying prioritize the most important. I am not saying skip the less important. There is a big difference.

Bottom line is that there will be much up front work to do before you start programming:
1. Mapping source to target data elements
2. Profiling source data
3. Review the results of the profile with the customer so both parties understand the level of data quality and how it affects the actual processing.
4. Extract exception rules from the results of the data profiling and add them to the mapping document for use later in coding.
5. Get an understanding from the customer of what to do with future data that comes in and falls outside what is acceptable. Keep it and do nothing to it, reject it and put it into a different file for review, etc.? Put that information into the mapping document for future coding.
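
To make the steps above a bit more concrete, here is a rough sketch of a mapping document expressed as code, with rejected rows set aside for review. All field names, validators, and tiers are hypothetical; the real ones come out of the mapping and profiling work with the customer.

```python
def is_nonempty(value):
    return bool(value and value.strip())

def is_sex_code(value):
    return value in {"M", "F", "U"}

# Hypothetical mapping: target field -> (source field, validator, importance tier).
MAPPING = {
    "last_name": ("LNAME", is_nonempty, 1),
    "sex": ("SEX_CD", is_sex_code, 2),
}

def convert(source_rows):
    """Split rows into converted records and rejects, noting why each reject failed."""
    converted, rejects = [], []
    for row in source_rows:
        record, problems = {}, []
        for target, (source, is_valid, tier) in MAPPING.items():
            value = row.get(source)
            if is_valid(value):
                record[target] = value
            else:
                problems.append((target, tier, value))
        if problems:
            rejects.append({"row": row, "problems": problems})
        else:
            converted.append(record)
    return converted, rejects

converted, rejects = convert([
    {"LNAME": "Smith", "SEX_CD": "M"},
    {"LNAME": "", "SEX_CD": "X"},
])
```

This version rejects a row on any failure. Whether a tier 2 or tier 3 problem should reject the row or just be logged is one of the decisions from step 5.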

This seems like a lot for a conversion, and if we are talking about only a handful of data elements then it may be. My background is in multi-terabyte database processing on millions to billions of records, so profiling is a very important part of interface processing: it tells us how to handle anomalies, because there is no way we can hand-correct that volume of data if/when mistakes are uncovered.

I'll post more as I have time.
__________________
Fender J-Bass | Carvin B40 | Yamaha BB1000S
Ampeg SVT-7/8 Pro | TCE RH750 | Anchak fEARful 15/6/1

Last edited by Scogman : 09-18-2011 at 10:36 PM.
  #8  
Old 09-18-2011, 11:01 PM
guitar<bass's Avatar
The ever-so-useless
 
Join Date: Oct 2005
Supporting Member
I would use MS Paint. It's in Start/Programs/Accessories.

You'll get it.
__________________
AKA: onyx_riddle Redneck Bassist #54
  #9  
Old 09-19-2011, 04:13 AM
Registered User
 
Join Date: Aug 2011
Location: Purwakarta/Jakarta, Indonesia
Quote:
Originally Posted by mattsk42 View Post
I'm just starting a new job where PART of my job is to be a "Conversion Specialist". Problem is, I know nothing about it. I have never programmed or done anything related; it's been all IT or graphic design. I'm learning SQL as much as I can.

What do I need to know, and what does it really involve? This will be for converting a sheriff's office database records TO our software that handles those records.

We have a program called "sypherlink harvester" to help, but I don't really see anything I recognize when looking at the specs for it.
Nothing that can't be solved with a liberal dose of Perl ...
__________________
Cort C4H | Cort Action 4 | Rockwell RB-32
Bassists Who Drive Manual #151 | Club Cort #205
  #10  
Old 09-19-2011, 08:22 AM
mattsk42's Avatar
$100 off new Directv subsp.PM me BEFORE signing up
 
Join Date: Oct 2004
Location: SiouxFalls by way of Pierre,SD
Supporting Member
Thanks for the long read, very interesting. Good info for me to look at/think about.
  #11  
Old 09-19-2011, 09:06 AM
Registered User
 
Join Date: Apr 2008
Location: Leuven, Belgium
As someone who dabbles in database design on a small scale, I wish you good luck.
__________________
Quote:
Originally Posted by PSPookie View Post
I bludgeon any potential attackers with my enormous e-penis.
Quote:
Originally Posted by XigXag View Post
Hunting wild vegetarians is cruel.
Copyright 2011 Talk Music Group Inc. All rights reserved.