How to import data into MS SQL server from CSV and XML
- Download TestXML_Script.zip - 856 B
Introduction
A previous article I wrote on 'DBA Skills for Developers' gave a round up of various tips and tricks to make the life of a programmer who had 'inherited' the task of in-house DBA hopefully a bit better. I was going to expand that article and add material on importing data to SQL Server, but decided it is a topic and article that will grow over time, so I broke it out on its own - and this is it.
Importing data to SQL Server
While the wonderful database those lovely people who pay you designed trundles along and does its thing, generating its own data from simple in-app CRUD and pulling data from web services and other assorted sources, the day will come when you are presented with a big lump of flat-file style data, perhaps in CSV or XML format, and told - import that please.... #groan
I have found that it's generally better to work with the tools provided by the clever people who designed and maintain MS SQL Server. For this reason, when presented with flat-file data, I tend to turn to three methods of importing data, depending on the type and format of the data.
(1) Bulk CSV data loader
Bulk Insert is a Transact-SQL statement that allows you to call and import data from the SQL command line. Typically I use it inside stored procedures that are either called manually or triggered on a schedule.
Here's an example... let's assume we have the following table already in place:
create table SomeImportTestTable (ID int, PersonsName varchar(50), Country varchar(100))
and a file called 'c:\import\testdata.csv' that contains data like this:
1, AJSONDEV, United Kingdom
2, Mahabir, India
3, Joanna, Netherlands
To get this data in and sitting in our table, we call bulk insert, like this...
bulk insert SomeImportTestTable from 'c:\import\testdata.csv' with (FieldTerminator = ',', RowTerminator = '\n')
It will import as expected into our table:
ID  PersonsName  Country
----------------------------------------------
1   AJSONDEV     United Kingdom
2   Mahabir      India
3   Joanna       Netherlands
You will have seen above that there are options to allow you to specify the delimiter between fields, the line terminator, etc. Learn more about the Bulk CSV loader here: https://msdn.microsoft.com/en-GB/library/ms188365.aspx
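Since I typically run this from a stored procedure on a schedule, here is a minimal sketch of how that might look (the procedure name is made up for illustration, and the FIRSTROW option is shown on the assumption that your file has a header row to skip):

create procedure dbo.ImportTestData
as
begin
    -- FIRSTROW = 2 skips a header line; drop it if your file has none
    bulk insert SomeImportTestTable
    from 'c:\import\testdata.csv'
    with (FirstRow = 2, FieldTerminator = ',', RowTerminator = '\n');
end

The procedure can then be called manually or wired into a SQL Server Agent job.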
(2) BCP data import utility
Sometimes you can't get inside the database to build and run a script for whatever reason - in this case, you can call on bulk import's best friend, the BCP utility. This little nugget is a command line utility that basically does the same thing as the bulk import statement (I have read somewhere that it calls the bulk import utility but I've not confirmed that thus far). BCP can be used not only by those who speak the dark tongue of SQL, but pretty much anyone who can hack together a command line script - the reason for this is that it is very much command line/switch driven.
Bringing data from a CSV file into your database is a snap...
bcp MyDatabaseName.SomeImportTestTable in c:\import\testdata.csv -t, -U MyLoginName
Pretty simple - call the utility name, followed by the name of the database/table you wish to import into, the keyword 'IN' (that's the direction!), the '-t,' tells the utility the fields are comma delimited (the default terminator is a tab), and the '-U' says 'next token is my username for login'. Once you press <enter>, the utility will ask for your password and then execute.
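As a slightly fuller, hedged example (the schema and server name here are placeholders I have added, not values from above), you can also pass '-c' so BCP treats everything as character data rather than prompting for each column's format, and '-T' to use a trusted Windows connection instead of a username/password:

bcp MyDatabaseName.dbo.SomeImportTestTable in c:\import\testdata.csv -c -t, -S MyServerName -T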
As well as using the BCP utility to bring data into a database, you can also use it to get data out. This MSDN example illustrates:
bcp AdventureWorks2012.Sales.Currency out "Currency Types.dat" -T -c
Learn more about the BCP utility at this link: https://msdn.microsoft.com/en-gb/library/ms162802.aspx
NB: Gotchas with importing CSV data
You know that feeling when you acquire a new skill or tool and you are on the honeymoon period and everything is rosy in the garden and you can't wait to get up and smell the roses and boy oh boy are they just wonderful ... <breathe!!> ... UNTIL IT ALL JUST GOES FOOBAR !!!! #OhDear .... right - here are some hard lessons learned about importing CSV style data into SQL...
(a) watch the integrity of your delimiters ... that is, ensure every item that needs to be wrapped in a quote is wrapped, and every field that needs to be separated by a comma is separated ... also (!!!!!!) ... look out for any data that *breaks* the format ... by this I mean a row using single quotes to isolate data tokens, that is effectively broken by the user data containing the same delimiter somewhere..
example - working: '1','AJSONDEV','United Kingdom'
example - NOT working: '1','AJSON O'DEV','United Kingdom'
in the non-working example above, note that AJSON has gone all Irish on us and is now Ajson O'Dev (begorrah!) ... unfortunately, the single quote separating the "O" and "Dev" is also the delimiter and so the import will break.
(b) watch the quality of the data being imported and watch out for any odd looking characters (unexpected, non-standard ASCII for example) that could break the import.
So what to do about it? ... I generally use one of two approaches, depending on the severity of the import failures I see as my project progresses. Some data by its nature comes in clean, other data is just downright messy. The first approach is generally to use a white-list of characters actually allowed to be imported, and pre-parse the CSV file to enforce the list (eg: only allow 'A..z', '0..9'). The second approach, which is a bit more work and heavy handed, is to take each token due to be imported, encode it (I have used base64 at times for this), import, and then decode. The latter still allows any spurious characters to be transferred and gives me the least amount of pain once set up correctly.
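As a rough sketch of the white-listing idea (this variation cleans a value once it has landed in a staging column rather than pre-parsing the file itself, and the function name is made up for illustration), a T-SQL helper might look like:

create function dbo.StripToWhitelist (@s varchar(4000))
returns varchar(4000)
as
begin
    -- remove any character that is not A..Z, a..z, 0..9, space or comma
    declare @bad int = patindex('%[^A-Za-z0-9 ,]%', @s);
    while @bad > 0
    begin
        set @s = stuff(@s, @bad, 1, '');
        set @bad = patindex('%[^A-Za-z0-9 ,]%', @s);
    end
    return @s;
end

For example, select dbo.StripToWhitelist('AJSON O''DEV!') returns 'AJSON ODEV' - the stray quote and exclamation mark are silently dropped.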
(3) XML data import
SQL Server gives a lot of flexibility when dealing with XML data, but you have to duck and weave at times to make things happen. If you have an XML file that you wish to import, where the data is neatly structured in easy to get to nodes, then you are in luck - things will run very smoothly for you. If on the other hand you have numerous child nodes, lots of deep trees etc, then you have to dig a bit to get at what you require. You have the option to take an entire XML file from disk and store it into a special table column of type XML, or to load up an XML file, query its contents, and extract these and insert them into a standard table for regular plain old SQL manipulation.
Example 1 - simple import to an XML type field.
Assuming we have a table with a column of type XML, we can use the following SQL command to import a file 'testdata.txt' into that column.
Insert into MyTestImportTable(XmlColumn) select * from OpenRowSet(Bulk 'c:\import\testdata.txt', Single_blob) as x;
(... oh goodness, looksee, it's our friend 'BULK' doing the heavy lifting for us again .... how cool!)
Note that the data after execution is now present *in XML format*, stored in one single column, and you still need to do some work to extract the data and use it.
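To illustrate that extra step (a minimal sketch, assuming the imported file contains the same Persons/Person structure used later in this article), you can pull values straight out of the stored XML column with the value() method:

select XmlColumn.value('(/Persons/Person/PersonsName)[1]', 'varchar(50)') as FirstPersonName
from MyTestImportTable;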
Example 2 - import to a VarChar/String variable
Sometimes we need to examine the XML as text, perhaps to pre-parse something before manipulating it as XML. This example shows how to load the file into a VarChar variable.
Declare @fileData varchar(max); Select @fileData = BulkColumn from OpenRowSet(Bulk 'c:\import\testdata.xml', Single_blob) x;
The variable @fileData is now available, as a string, to manipulate as you see fit. To cast it back to an XML object, simply convert it.
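For example (a minimal sketch, assuming @fileData was loaded as above):

Declare @xmlData xml = convert(xml, @fileData);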
Example 3 - import XML data and insert individual values into table row columns
Our input XML example would look like this:
<Persons>
  <Person>
    <ID>1</ID>
    <PersonsName>AjsonDev</PersonsName>
    <Country>United Kingdom</Country>
  </Person>
  <Person>
    <ID>2</ID>
    <PersonsName>Mahabir</PersonsName>
    <Country>India</Country>
  </Person>
  <Person>
    <ID>3</ID>
    <PersonsName>Joanna</PersonsName>
    <Country>Netherlands</Country>
  </Person>
</Persons>
The following SQL will open the file and read the data into individual record columns (note that XQuery element names are case-sensitive, so the paths below must match the XML exactly).
Declare @fileData XML
Select @fileData = BulkColumn from OpenRowSet(Bulk 'c:\import\testdata.xml', Single_blob) x;
insert into SomeImportTestTable (ID, PersonsName, Country)
select
    xData.value('ID[1]', 'int') ID,
    xData.value('PersonsName[1]', 'varchar(50)') PersonsName,
    xData.value('Country[1]', 'varchar(100)') Country
from @fileData.nodes('/Persons/Person') as x(xData)
Example 4 - extracting nested xml child data and flattening it out
For this example, we are going to look at an input XML file that has nested child nodes - let's look at that first:
<Persons>
  <Person>
    <ID>1</ID>
    <PersonsName>AjsonDev</PersonsName>
    <Country>United Kingdom</Country>
    <Languages>
      <Language>C#</Language>
      <Language>JavaScript</Language>
      <Language>Python</Language>
      <Language>Objective-C</Language>
      <Language>Pascal</Language>
      <Language>F#</Language>
      <Language>R</Language>
    </Languages>
  </Person>
  <Person>
    <ID>2</ID>
    <PersonsName>Mahabir</PersonsName>
    <Country>India</Country>
    <Languages>
      <Language>C#</Language>
      <Language>JavaScript</Language>
      <Language>C++</Language>
      <Language>VB</Language>
      <Language>SQL</Language>
    </Languages>
  </Person>
  <Person>
    <ID>3</ID>
    <PersonsName>Joanna</PersonsName>
    <Country>Netherlands</Country>
    <Languages>
      <Language>Dutch</Language>
    </Languages>
  </Person>
</Persons>
What we want to achieve is to extract the core data (Name, Country), but instead of having to construct another table (1:M) to store the languages, or put them into an XML field in our table, we decide to flatten the languages out by comma delimiting them into one field. Like most things in development, there are many ways to approach this. In this example, I show how to use XPath navigation to first go deep and extract all of the child nodes in one line, then use '..' notation to walk back up the XML tree and pick out the parent data we require.
insert into SomeImportTestTable (ID, PersonsName, Country, Languages)
select
    xData.value('../ID[1]', 'int') ID,
    xData.value('../PersonsName[1]', 'varchar(50)') PersonsName,
    xData.value('../Country[1]', 'varchar(100)') Country,
    xData.value('(.)[1]', 'varchar(1000)') Languages
from @fileData.nodes('/Persons/Person/Languages') as x(xData)
When the data is extracted, it will go as normal into the ID, PersonsName and Country fields; however, the 'Languages' data will include each language, separated by a line-break. In this case, to delineate the data, I may choose to separate out the tokens with a comma.
NB: XML Gotcha! XML can be really simple, or as complex as you can make it, to deal with. Sometimes there are things that cause issues when trying to load XML data using the methods described here. Recently, I had a problem with a namespace issue: when the namespace was present, the data would not load; when it was taken out, it would load. It needed to be referenced correctly, it seemed. If you encounter problems like this, you have two choices:
(a) you can reference the XML namespace explicitly, for example by using WITH XMLNAMESPACES; however, for me this didn't work, so I decided to simply strip out the namespace as I didn't need it.
(b) My quick/dirty solution was to load the XML as a VarChar, then remove the namespace string. Having done that, I converted the varchar back to an XML type and proceeded to work as required.
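A minimal sketch of that quick/dirty approach (the file path and namespace string here are placeholders - substitute whatever namespace is actually causing the trouble):

Declare @raw varchar(max), @fileData xml;
-- Single_clob reads the file straight in as character data
Select @raw = BulkColumn from OpenRowSet(Bulk 'c:\import\testdata.xml', Single_clob) x;
-- strip the offending namespace declaration, then convert back to XML
Set @raw = replace(@raw, ' xmlns="http://tempuri.org/SomeNamespace"', '');
Set @fileData = convert(xml, @raw);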
Import/export wizard
Even in the lowest version of MS SQL Server, there is some GUI assistance with data import where you can avoid coding (why?!! :P) ... as you would expect however, there are some restrictions, the main one being that you cannot save your ETL plan. Regardless, when you need to get something done in a hurry, this tool is very useful.
Quote:
In SQL Server Express, Web, or Workgroup, you can run the package that the Import and Export Wizard creates, but cannot save it. To save packages that the wizard creates, you must upgrade to SQL Server Standard, Enterprise, Developer or Evaluation.
Using SQL Express, from the 'All programs' menu, select 'SQL Server Import and Export Wizard' - it's very clearly laid out and self explanatory: select a data source, edit if necessary any parameters relating to the parsing and mapping of the incoming data, select a data destination, and let it fly!
(1) Start wizard
(2) choose data source
(3) specify data source type
(4) edit any import parameters and mappings between source and destination
(5) confirm source/destination
(6) Run the import!
Advanced BCP
Earlier in the article, we looked briefly at using the BCP utility to import and export data. There is however more to it than meets the eye, and if you start using the utility heavily, it's worth delving a bit more into it to see how else it can assist you. Here are some of the switches I use most frequently:
- -a packet_size: this tells the utility the number of bytes (per network packet) to use when exchanging data to/from the server. The default is 4096, but I have found for large datasets that increasing this size can give improved performance on import routines.
- -b batch_size: you can use this flag to pass through the total number of rows that should be processed for import at one time (a batch). This is *extremely* useful when you are dealing with potentially dodgy data, where an import continuously breaks and you need to (a) track it down and (b) isolate the issue so the rest of the data can keep flowing. This switch ensures that should a batch fail, the rest of the process will continue - in addition, any actions carried out as part of the failed batch's transaction will be rolled back.
- -e err_file: I always use this in conjunction with the -b above. It gives the utility a location where I want it to write a log of any errors encountered; it will output any rows that gave errors and were not imported. Big gotcha! ... if you *don't* include this when using the batch-size option, then a log file will not be created and you won't know what was left out of the import should anything go wrong!
- -F first_row: This one is sweet, and can be used in combination with -L ('last row', below) - in fact using them both together is similar to the 'skip/take' pattern we use when dealing with paging scenarios. This switch tells the utility which row of the data file to start importing from.
- -L last_row: The other side of first_row, it indicates what the last row number to import should be. Using -F and -L together lets you extract a chunk of data in the middle of a big import file while ignoring the rest.
- -m max_errors: Clearly there is no point in processing a file if it is full of errors, and you may decide you can only accept a certain number of errors before bailing out. -m allows you to specify this number. An error occurs when a row of data cannot be imported for whatever reason.
- -t field_term: Depending on the source of your data, the field data in each row may be delimited by a tab, comma, pipe, etc. This switch allows you to tell BCP what the field terminator is. The default is \t (the tab character).
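Pulling a few of these together (the schema, server name, error-file path and batch size below are illustrative placeholders, not values from the article), a more defensive import command might look like this:

bcp MyDatabaseName.dbo.SomeImportTestTable in c:\import\testdata.csv -c -t, -F 2 -b 1000 -m 10 -e c:\import\import_errors.log -S MyServerName -T

Any rows that fail are written to import_errors.log while the remaining batches carry on loading.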
Selective export
Sometimes I only need to take parts of the data from a table, and not export the entire thing. To assist with this, we can use the 'queryout' switch.
examples:
bcp " SELECT Proper noun FROM SomeTable" queryout MyExportedFile.txt -T
bcp " SELECT * FROM Country WHERE PopulationMillion > ten AND ExportKGMonth > 170 " queryout " CountryExports.dat" -T
Data im/ex in the Cloud
As we move more data and services to the cloud, we are faced with coming up with different ways of doing things from the desktop and traditionally hosted server. Regardless of the platform however, the requirement to import/export CSV/XML/etc formatted data remains. Sometimes it may not seem obvious how to do things and you have to come up with various workarounds that fit the situation. Here are some things I have done to assist with data import/export on both Azure and other cloud hosted environments.
Setup an external drive.
One of the solutions I use is to create a virtual drive that is external to my SQL VM, and use this as a staging point for data. I can interact directly with SQL to work with the data on the drive, or hook it in as a mapped drive to the SQL VM and use it for automated jobs such as backup. I have an article here on scheduled backup of Azure based SQL using this method.
Remote into your drive
There are various tools available to hook into cloud storage - the one I generally use is Azure Cloudberry Explorer from Cloudberry Labs (nb: there are versions for EC2 etc. also). The freeware version is extremely capable and will do most things you need for facilitating basic im/ex. If you require more power, then their pro version is a sizable step up and is very reasonably priced for the features it has and, importantly, the time it saves you.
(the screenshot above from cloudberry shows local file explorer on the left, and the remote virtual drive on the right).
Other methods
There are a number of other approaches to loading data into SQL Server - I will cover them in a later update to this article. Amongst those to consider would be SSIS, which allows you to load blocks of data in parallel, always a good thing when you are dealing with large volumes of data.
Here are some interesting resources worth checking out about SSIS:
- How to Create your First SQL Server Integration Services (SSIS) Package - Part 1
- Import Images to SQL Server using SSIS
I have attached both an SQL script and test XML data to demonstrate the code working - feel free to download and try it. Finally, if you liked the article, please give it a vote at the top of the page!
History
21 January 2016 - Version 1
10 Feb 2016 - Version 2 - added im/ex wizard
17 Feb 2016 - Version 3 - added Azure/cloud section, added section on advanced BCP im/ex
12 Apr 2016 - Version 4 - Updated code, added sample SQL download script
Source: https://www.codeproject.com/Articles/1073184/How-to-import-data-into-MS-SQL-server-from-CSV-and