SELECT SUM(revenue) FROM orders
Every so often I wish I could write a similar query for strings. For example, wouldn't it be neat if we could get a list of employees by department like this?
SELECT TO_LIST(name), department FROM employees GROUP BY department
As it turns out, there's a way to do this, but it's a little bit tricky. You have to use an XML feature to do something it wasn't really designed to do. By converting results into XML and back, you can produce lists exactly the way you might want. Here's what the query looks like:
SELECT department
, SUBSTRING(
(SELECT ', ' + name
FROM employees e2
WHERE e2.department = e1.department
FOR XML PATH(''))
, 2, 999) AS list_of_names
FROM employees e1
GROUP BY department
Why does this work? First, you're doing a main query that just lists all the departments. So you get a list like this:
department
-----------
Sales
Accounting
Development
Then, for each record in this list, we do a subquery listing "', ' + name" for each record. That produces these results internally:
field1
-----------
, Bob
, Chris
, Alex
The XML path with a blank parameter turns that into a gigantic text string, like so:
field1
------------------
, Bob, Chris, Alex
And of course the Substring removes the first two characters. The end result is that you get exactly the list you expect!
| permalink | related link |




( 3 / 233 )So I typically have to insert large data sets into our database. I partner with a large number of companies that send us dozens of megabytes of data daily up to gigabytes. Most of this data also needs parsing: we need to execute logic on each record before it gets pumped into the database. This prevents us from using bulk data loaders except in the most basic cases.
I'm aware of a few different ways of loading data into a SQL database; each of which has positives and negatives.
Insert using SqlCommand and SqlParameter
This is typically the slowest approach, but it works well, it's safe from SQL injection attacks, and it can be used for any number of rows from one to one million. Here's a sample insert for 10,000 rows:
SqlConnection conn = new SqlConnection(connection_string);
conn.Open();
SqlCommand cmd = new SqlCommand("INSERT INTO test_table (var1, var2, var3, str1, str2) VALUES (@var1, @var2, @var3, @str1, @str2)", conn);
cmd.Parameters.Add(new SqlParameter("@var1", var1));
cmd.Parameters.Add(new SqlParameter("@var2", var2));
cmd.Parameters.Add(new SqlParameter("@var3", var3));
cmd.Parameters.Add(new SqlParameter("@str1", str1));
cmd.Parameters.Add(new SqlParameter("@str2", str2));
try {
for (int i = 0; i < 10000; i++) {
cmd.Parameters["@var1"].Value = var1;
cmd.Parameters["@var2"].Value = var2;
cmd.Parameters["@var3"].Value = var3;
cmd.Parameters["@str1"].Value = str1;
cmd.Parameters["@str2"].Value = str2;
cmd.ExecuteNonQuery();
}
} finally {
conn.Close();
}
Insert using Stored Procedures
This is a bit faster than the basic method, since the SQL server doesn't have to recompile a command each time it gets executed. However, it's still fairly slow since each insert is a separate command.
SqlConnection conn = new SqlConnection(actual_string);
conn.Open();
// Create the command string
SqlCommand cmd = new SqlCommand("EXEC insert_test @var1, @var2, @var3, @str1, @str2", conn);
// Iterate through all of the objects
try {
for (int i = 0; i < 10000; i++) {
cmd.Parameters.Clear();
cmd.Parameters.Add(new SqlParameter("@var1", var1));
cmd.Parameters.Add(new SqlParameter("@var2", var2));
cmd.Parameters.Add(new SqlParameter("@var3", var3));
cmd.Parameters.Add(new SqlParameter("@str1", str1));
cmd.Parameters.Add(new SqlParameter("@str2", str2));
// Read in all the data
cmd.ExecuteNonQuery();
}
} finally {
conn.Close();
}
Insert using Stored Procedure and Transactions
This combines multiple best practices for SQL database management. First, it uses parameters to protect against SQL injection attacks. Next, it uses stored procedures, which isolates your database from your C# code and keeps the data layer separate from the business layer. Third, it uses transactions to limit the length of time the database is hit by locking. It's also faster!
// Create the database connection
SqlConnection conn = new SqlConnection(actual_string);
conn.Open();
SqlTransaction trans = conn.BeginTransaction();
// Create the command string
SqlCommand cmd = new SqlCommand("EXEC insert_test @var1, @var2, @var3, @str1, @str2", conn);
cmd.Transaction = trans;
// Iterate through all of the objects
try {
for (int i = 0; i < 10000; i++) {
cmd.Parameters.Clear();
cmd.Parameters.Add(new SqlParameter("@var1", var1));
cmd.Parameters.Add(new SqlParameter("@var2", var2));
cmd.Parameters.Add(new SqlParameter("@var3", var3));
cmd.Parameters.Add(new SqlParameter("@str1", str1));
cmd.Parameters.Add(new SqlParameter("@str2", str2));
// Read in all the data
cmd.ExecuteNonQuery();
// Periodically commit the transaction
if (i % 1000 == 999) {
trans.Commit();
trans = conn.BeginTransaction();
cmd.Transaction = trans;
}
}
} finally {
if (trans != null) trans.Commit();
conn.Close();
}
Insert using Union All
This is a really clever approach that is blindingly fast, but it requires that you manually escape your text to prevent SQL injection attacks. Here's some sample code:
SqlConnection conn = new SqlConnection(actual_string);
conn.Open();
// Build a massive insert statement
try {
StringBuilder sql;
for (int i = 0; i < 100; i++) {
sql = new StringBuilder();
sql.Append("INSERT INTO test_table (var1, var2, var3, str1, str2) SELECT \r\n");
for (int j = 0; j < 100; j++) {
sql.AppendFormat("{0}, {1}, {2}, '{3}', '{4}' UNION ALL SELECT \r\n",
var1, var2, var3, str1.Replace("'", "''"), str2.Replace("'", "''")
);
}
// Subtract the last "union all select"
sql.Length -= 19;
// Insert this batch
SqlCommand cmd = new SqlCommand(sql.ToString(), conn);
cmd.ExecuteNonQuery();
}
} finally {
conn.Close();
}
Insert using XML Parsing
This is a clever trick I've seen that seems to have roughly the same performance as a Union All, but can work with big XML documents.
List<string> values = new List<string>();
for (int i = 0; i < 10000; i++) {
values.Add(String.Format("<var1>{0}</var1><var2>{1}</var2><var3>{2}</var3><str1>{3}</str1><str2>{4}</str2>", var1, var2, var3, str1, str2));
}
string xmlString = XMLUtilities.BuildSQLXmlString("Items", "Item", values.ToArray());
// Create the database connection
SqlConnection conn = new SqlConnection(actual_string);
conn.Open();
SqlCommand cmd = new SqlCommand(@" DECLARE @XML XML
SELECT @XML = @xmlString
DECLARE @Mapping TABLE (
var1 INT,
var2 INT,
var3 INT,
str1 NVARCHAR(255),
str2 NVARCHAR(20)
)
INSERT INTO @Mapping(var1,var2,var3,str1,str2)
SELECT ParamValues.Item.query('var1').value('.','INT')
, ParamValues.Item.query('var2').value('.','INT')
, ParamValues.Item.query('var3').value('.','INT')
, ParamValues.Item.query('str1').value('.','NVARCHAR(255)')
, ParamValues.Item.query('str2').value('.','NVARCHAR(20)')
FROM @XML.nodes('/Items/Item') AS ParamValues(Item)
INSERT INTO test_table(var1,var2,var3,str1,str2)
SELECT m.var1,m.var2,m.var3,m.str1,m.str2
FROM @Mapping m", conn);
cmd.CommandTimeout = 10000;
cmd.Parameters.Add(new SqlParameter("@xmlString", xmlString));
// Iterate through all the items we're going to create
try {
cmd.ExecuteNonQuery();
} finally {
conn.Close();
}
Relative Performance
Bet you're wondering how they stack up! Here are the numbers for a run of 10,000 records:
Insert with Command/Parameter - 64.48 seconds
Insert Stored Procedure - 58.40 seconds
Insert Stored Procedure using BeginTransaction - 16.03 seconds
Insert Union All using SQLParameters - 3.18 seconds
Insert Union All using escaped text - 1.42 seconds
Insert XML - Occasional Crash!
Recommendations
If you're writing general purpose code, just use the plain old stored procedure approach, and use a transaction if you do multiple inserts at a time. It's guaranteed to be reliable, it's guaranteed to be a bit faster than the typical method, and it's all around good practice.
However, if you do need absolute speed, I highly recommend the Union All approach. The best part is, if you switch to a different database, most other RDBMS programs support something similar to Union All so your code will directly convert over!
| permalink | related link |




( 2.9 / 264 )Up until now, my team has been using Visual Studio for all our test and build needs. We've used Visual Studio completely - test suites, solution files, etc. Recently we decided to start using CruiseControl.NET to put ourselves on the continuous integration path. What's the best way to get started quickly on CruiseControl without changing our workflow?
On the positive side, we use Subversion for our version control, which is supported right out of the box. It's very easy for us to create a new account and flag it for use of CruiseControl.net, because we run VisualSVN Server and we have active directory authentication set up.
Then the sacrifices came into play. We decided against using NANT to build our code. We have been successfully using Visual Studio 2008 to build / edit / test, and we didn't want to have to support both the Visual Studio style Solution / Project files (*.csproj, *.sln) as well as the NANT files. So that solution didn't work. Similarly, since we're using MSTEST (Microsoft's built-in test suite stuff), NUnit doesn't do everything we want.
Shame on us for using proprietary vendor code :) Anyways, turns out there is a good solution. Here's how it works.
1) Set up a desktop machine OR a Virtual PC image / VMWare image. We went with the Virtual PC solution since I could just run it on my local desktop. Microsoft publishes some stock Virtual PC Images of Windows XP that you can use.
2) Install the following programs - Subversion client, Visual Studio 2008, and CruiseControl.NET.
3) Create a new Subversion account on your repository named "CruiseControl", and grant it read-only access.
4) Edit your CCNet.config file (c:\program files\cruisecontrol.net\server\ccnet.config) to set up the project, the test solution, and launch CruiseControl as a service. We chose to send out build emails since we're still adopting the service and we need to get people used to it; eventually I'm sure folks will use the tray application.
5) You should probably copy up the files from your CruiseControl folder to your intranet server in a location where everyone can browse it. That helps you see statistics and build results.
Here's my sample CCNet.config file:
<cruisecontrol xmlns:cb="urn:ccnet.config.builder">
<queue name="Q1" duplicates="ApplyForceBuildsReplace"/>
<project name="DMT" queue="Q1" queuePriority="1">
<webURL>your intranet web server URL here</webURL>
<workingDirectory>a temporary directory here</workingDirectory>
<artifactDirectory>a different temporary directory here</artifactDirectory>
<modificationDelaySeconds>10</modificationDelaySeconds>
<triggers>
<intervalTrigger seconds="60" name="continuous" />
</triggers>
<sourcecontrol type="svn">
<executable>c:\program files\subversion\bin\svn.exe</executable>
<trunkUrl>your subversion repository location here</trunkUrl>
<workingDirectory>Where the code will go on your build machine</workingDirectory>
<username>cruisecontrol</username>
<password>your password here</password>
<revert>true</revert>
<cleanUp>true</cleanUp>
<tagOnSuccess>false</tagOnSuccess>
</sourcecontrol>
<tasks>
<msbuild>
<executable>c:\windows\microsoft.net\framework\v3.5\msbuild.exe</executable>
<workingDirectory>The application directory here</workingDirectory>
<projectFile>The name of the solution file</projectFile>
<targets>build</targets>
<timeout>300</timeout>
</msbuild>
<exec>
<executable>cmd.exe</executable>
<buildArgs>/c del /s/q c:\mstest-results.trx</buildArgs>
<description>This clears out the data from the previous MSTEST.EXE run.</description>
</exec>
<exec>
<executable>C:\program files\microsoft visual studio 9.0\common7\ide\mstest.exe</executable>
<buildArgs>/testcontainer:name of your test DLL file here /resultsfile:c:\mstest-results.trx</buildArgs>
<description>Run unit tests for your application and save the results to c:\mstest-results.trx</description>
</exec>
</tasks>
<publishers>
<merge>
<files>
<file>c:\mstest-results.trx</file>
</files>
</merge>
<xmllogger />
<statistics />
<modificationHistory onlyLogWhenChangesFound="true" />
<email from="fake send-from email address"
mailhost="smtp server goes here"
mailport="25"
mailhostUsername="cruisecontrol"
mailhostPassword="password goes here"
includeDetails="true">
<users>
<user name="Build List" group="buildmaster" address="your distribution list email address goes here" />
</users>
<groups>
<group name="buildmaster" notification="always"/>
</groups>
<modifierNotificationTypes>
<NotificationType>Failed</NotificationType>
<NotificationType>Fixed</NotificationType>
</modifierNotificationTypes>
<subjectSettings>
<subject buildResult="StillBroken" value="Build is still broken for {CCNetProject}" />
</subjectSettings>
</email>
</publishers>
</project>
</cruisecontrol>
| permalink | related link |




( 2.9 / 267 )In the old days I used to use the "AT" command to schedule simple recurring tasks for Windows. The "AT" command was simple enough but had some limitations, and for whatever reason it never seemed reliable, although maybe that was because I never treated it seriously.
As it turns out, Microsoft has replaced that with a new command called "SCHTASKS". Try this:
Start | Run
CMD.EXE
schtasks /query
You should see a message like this:
INFO: There are no scheduled tasks present in the system.
This just tells you that you do not have any scheduled tasks. Apparently, though, schtasks lets you set them up via the command line! Very useful. Of course, the alternative is to go Start | Control Panel | Scheduled Tasks and use the GUI; however I find many situations where I like to have the ability to execute changes automatically using the command line.
For more instructions on SCHTASKS, type "SCHTASKS /?" and you'll get a help page. The instructions seem good enough, so enjoy!
EDIT: Also note that when you're using SCHTASKS, quotes are escaped using the backslash. Here's an example of an escaped nested command:
schtasks /create /ru Administrator /rp 33d@r1294 /SC DAILY /TN "Download Scripts" /TR "C:\GamePulse\TaskRun.exe \"cmd.exe\" \"/c c:\gamepulse\running\download-scripts.cmd\" log %CUSTID% \"Download Scripts\""
| permalink | related link |




( 2.8 / 228 )Users of my applications like to get reports directly in native Excel so they can use them quickly. I typically save files in CSV, which is fast and reliable but Excel doesn't support it as thoroughly as it should.
On the other hand, using Excel automation to populate a file takes an eternity and doesn't work well on a server. So here's a nice middleground - I can save the file to an XML/HTML format that Excel supports natively.
This strategy solves a few problems:
1) Performance - This format is just as fast to create as CSV.
2) Dependence - Does not require using any Excel libraries.
3) Compatibility - Excel opens the file and it looks properly formatted and usable.
4) Data Integrity - CSV files tend to damage numeric or date values if they are formatted in a way Excel does not expect; this format converts everything to text which means the values are represented exactly.
You also have the option of using some of Microsoft Excel's number formatting tools. These are surprisingly hard to find documented anywhere! But, here they are:
http://www.ozgrid.com/Excel/CustomFormats.htm
Without further ado, here's the code:
/// <summary>
/// Save a DataTable to an XML file
/// </summary>
/// <param name="filename"></param>
/// <param name="dt"></param>
public static void SaveDataTableToExcelCompatibleFile(string filename, DataTable dt)
{
StreamWriter sw = null;
try {
// Create the file and write the header
sw = new StreamWriter(filename);
sw.WriteLine(@"<head><META http-equiv=""Content-Type"" content=""text/html; charset=UTF-8""></head><table>");
// Write column names
sw.Write("<tr>");
for (int i = 0; i < dt.Columns.Count; i++) {
sw.Write("<th filter=\"all\">");
sw.Write(dt.Columns[ i ].ColumnName);
sw.Write("</th>");
}
sw.Write("</tr>");
// Write data
for (int i = 0; i < dt.Rows.Count; i++) {
DataRow dr = dt.Rows[ i ];
sw.Write("<tr>");
for (int j = 0; j < dt.Columns.Count; j++) {
sw.Write("<td style=\"vnd.ms-excel.numberformat:@\">");
sw.Write(dr[j].ToString());
sw.Write("</td>");
}
sw.Write("</tr>");
}
// Write footer
sw.WriteLine(@"</table><br>
<br>Confidential Information - Do Not Distribute");
} finally {
if (sw != null) sw.Close();
}
}
| permalink | related link |




( 3 / 201 )Back Next

Calendar



