Select biggest record using LINQ 
I just realized I'd hate to write out the code to sort through a list of strings and find the longest one. So I tried to do it in LINQ. Here's what I got:

List<string> words = new List<string>();
... fill in a bunch of data ...
var v = (from s in words orderby s.Length descending select s).Take(1);


The "orderby s.Length" clause sorts by string length. Descending takes longest ones first and shortest ones last. The part I didn't know was the ".Take(1)" function - that's interesting! So, basically, I get the first string returned by the select statement.

Of course, now you have to take a "var" and turn it into a string object. Here's a shortcut:
IEnumerable<string> list = v as IEnumerable<string>;
if (list != null) {
foreach (string s in list) {
return s;
}
}


  |  permalink  |  related link  |   ( 2.8 / 85 )
Windows 7 Problems 
As much as I'd welcome an upgrade to Windows, I doubt I'll be interested in what Microsoft is producing.

1) The new UI has a nifty "translucent" look to everything. However, this makes it virtually impossible to glance at text - now you have to study it carefully to make sure the background isn't distorting your words. This is part of the reason I haven't grown attached to Vista, and I'm sad to see it isn't changing much in the next release.

2) Every individual part of the default UI is now more cluttered. What used to be a simple "folder" window now has tons of options, hyperlinks, and buttons. In many cases, the icons now randomly jump in size depending on what Windows thinks you have in the folder - you can go from tiny icons to huge ones, and funny "1-5 star" ratings systems interjected if the folder contains some music or video files.

3) The control panel now attempts to hide stuff it thinks you don't want to see. This causes two problems: first, it's harder to find the correct choice to make since you have to bypass all the choices it thinks you want to see; and secondly, when you make a choice, it's hard to know if you've found the correct place to make it. For example, It takes a very long time now to find "network connections", when it used to be very simple to bring up a list of all network adapters on your computer and select the one you wanted to check.

I would be very interested in an OS that would reduce the amount of unnecessary garbage that clogs up the user interface. Ubuntu and Mac OSX both do well in this regard.
  |  permalink  |  related link  |   ( 2.8 / 240 )
Null Pointer Exceptions when Comparing Strings 
In the olden days, it was possible to compare strings like so:
string s1, s2;

if (s1 = s2) then begin
PrintLn("Strings match!");
end else begin
PrintLn("Strings don't match.");
end;

Of course, we're no longer in the Turbo Pascal era. What happens when you compare two strings using the == symbol in .net? It wraps a call to String.Equals. However, it also returns true if both strings are null!
string s1, s2;

if (s1 == s2) {
Console.WriteLine("S1 and S2 are equivalent according to String.Equals(). They could be null!");
} else {
Console.WriteLine("S1 and S2 have different values, or one is null and the other is not null.");
}

Another way to run this test is to use s1.equals(s2) to see if they match - but wait! Null pointer exceptions!
string s1 = null;
string s2 = "Test";

if (s1.Equals(s2)) {
Console.WriteLine("Poof! Null Pointer Exception!");
}

Of course, there's a simple answer. There is one and only one correct way to compare strings. Here it is:
string.Equals(s1, s2, StringComparison.CurrentCultureIgnoreCase)

That also gets rid of the "case sensitivity" bug :)
  |  permalink  |  related link  |   ( 3.2 / 286 )
SUM() Strings in Microsoft SQL server 
It's really easy to sum numeric values. You write this:
SELECT SUM(revenue) FROM orders

Every so often I wish I could write a similar query for strings. For example, wouldn't it be neat if we could get a list of employees by department like this?
SELECT TO_LIST(name), department FROM employees GROUP BY department

As it turns out, there's a way to do this, but it's a little bit tricky. You have to use an XML feature to do something it wasn't really designed to do. By converting results into XML and back, you can produce lists exactly the way you might want. Here's what the query looks like:
SELECT department
, SUBSTRING(
(SELECT ', ' + name
FROM employees e2
WHERE e2.department = e1.department
FOR XML PATH(''))
, 2, 999) AS list_of_names
FROM employees e1
GROUP BY department

Why does this work? First, you're doing a main query that just lists all the departments. So you get a list like this:
department
-----------
Sales
Accounting
Development

Then, for each record in this list, we do a subquery listing "', ' + name" for each record. That produces these results internally:
field1
-----------
, Bob
, Chris
, Alex

The XML path with a blank parameter turns that into a gigantic text string, like so:
field1
------------------
, Bob, Chris, Alex

And of course the Substring removes the first two characters. The end result is that you get exactly the list you expect!
  |  permalink  |  related link  |   ( 3 / 233 )
Fast Record Insertion into Microsoft SQL Server using C# 
So I typically have to insert large data sets into our database. I partner with a large number of companies that send us dozens of megabytes of data daily up to gigabytes. Most of this data also needs parsing: we need to execute logic on each record before it gets pumped into the database. This prevents us from using bulk data loaders except in the most basic cases.

I'm aware of a few different ways of loading data into a SQL database; each of which has positives and negatives.

Insert using SqlCommand and SqlParameter


This is typically the slowest approach, but it works well, it's safe from SQL injection attacks, and it can be used for any number of rows from one to one million. Here's a sample insert for 10,000 rows:

SqlConnection conn = new SqlConnection(connection_string);
conn.Open();
SqlCommand cmd = new SqlCommand("INSERT INTO test_table (var1, var2, var3, str1, str2) VALUES (@var1, @var2, @var3, @str1, @str2)", conn);
cmd.Parameters.Add(new SqlParameter("@var1", var1));
cmd.Parameters.Add(new SqlParameter("@var2", var2));
cmd.Parameters.Add(new SqlParameter("@var3", var3));
cmd.Parameters.Add(new SqlParameter("@str1", str1));
cmd.Parameters.Add(new SqlParameter("@str2", str2));
try {
for (int i = 0; i < 10000; i++) {
cmd.Parameters["@var1"].Value = var1;
cmd.Parameters["@var2"].Value = var2;
cmd.Parameters["@var3"].Value = var3;
cmd.Parameters["@str1"].Value = str1;
cmd.Parameters["@str2"].Value = str2;

cmd.ExecuteNonQuery();
}
} finally {
conn.Close();
}


Insert using Stored Procedures


This is a bit faster than the basic method, since the SQL server doesn't have to recompile a command each time it gets executed. However, it's still fairly slow since each insert is a separate command.

SqlConnection conn = new SqlConnection(actual_string);
conn.Open();

// Create the command string
SqlCommand cmd = new SqlCommand("EXEC insert_test @var1, @var2, @var3, @str1, @str2", conn);

// Iterate through all of the objects
try {
for (int i = 0; i < 10000; i++) {
cmd.Parameters.Clear();
cmd.Parameters.Add(new SqlParameter("@var1", var1));
cmd.Parameters.Add(new SqlParameter("@var2", var2));
cmd.Parameters.Add(new SqlParameter("@var3", var3));
cmd.Parameters.Add(new SqlParameter("@str1", str1));
cmd.Parameters.Add(new SqlParameter("@str2", str2));

// Read in all the data
cmd.ExecuteNonQuery();
}
} finally {
conn.Close();
}


Insert using Stored Procedure and Transactions


This combines multiple best practices for SQL database management. First, it uses parameters to protect against SQL injection attacks. Next, it uses stored procedures, which isolates your database from your C# code and keeps the data layer separate from the business layer. Third, it uses transactions to limit the length of time the database is hit by locking. It's also faster!

// Create the database connection
SqlConnection conn = new SqlConnection(actual_string);
conn.Open();
SqlTransaction trans = conn.BeginTransaction();

// Create the command string
SqlCommand cmd = new SqlCommand("EXEC insert_test @var1, @var2, @var3, @str1, @str2", conn);
cmd.Transaction = trans;

// Iterate through all of the objects
try {
for (int i = 0; i < 10000; i++) {
cmd.Parameters.Clear();
cmd.Parameters.Add(new SqlParameter("@var1", var1));
cmd.Parameters.Add(new SqlParameter("@var2", var2));
cmd.Parameters.Add(new SqlParameter("@var3", var3));
cmd.Parameters.Add(new SqlParameter("@str1", str1));
cmd.Parameters.Add(new SqlParameter("@str2", str2));

// Read in all the data
cmd.ExecuteNonQuery();

// Periodically commit the transaction
if (i % 1000 == 999) {
trans.Commit();
trans = conn.BeginTransaction();
cmd.Transaction = trans;
}
}
} finally {
if (trans != null) trans.Commit();
conn.Close();
}


Insert using Union All


This is a really clever approach that is blindingly fast, but it requires that you manually escape your text to prevent SQL injection attacks. Here's some sample code:

SqlConnection conn = new SqlConnection(actual_string);
conn.Open();

// Build a massive insert statement
try {
StringBuilder sql;
for (int i = 0; i < 100; i++) {
sql = new StringBuilder();
sql.Append("INSERT INTO test_table (var1, var2, var3, str1, str2) SELECT \r\n");
for (int j = 0; j < 100; j++) {
sql.AppendFormat("{0}, {1}, {2}, '{3}', '{4}' UNION ALL SELECT \r\n",
var1, var2, var3, str1.Replace("'", "''"), str2.Replace("'", "''")
);
}

// Subtract the last "union all select"
sql.Length -= 19;

// Insert this batch
SqlCommand cmd = new SqlCommand(sql.ToString(), conn);
cmd.ExecuteNonQuery();
}
} finally {
conn.Close();
}


Insert using XML Parsing


This is a clever trick I've seen that seems to have roughly the same performance as a Union All, but can work with big XML documents.

List<string> values = new List<string>();
for (int i = 0; i < 10000; i++) {
values.Add(String.Format("<var1>{0}</var1><var2>{1}</var2><var3>{2}</var3><str1>{3}</str1><str2>{4}</str2>", var1, var2, var3, str1, str2));
}

string xmlString = XMLUtilities.BuildSQLXmlString("Items", "Item", values.ToArray());

// Create the database connection
SqlConnection conn = new SqlConnection(actual_string);
conn.Open();
SqlCommand cmd = new SqlCommand(@" DECLARE @XML XML
SELECT @XML = @xmlString
DECLARE @Mapping TABLE (
var1 INT,
var2 INT,
var3 INT,
str1 NVARCHAR(255),
str2 NVARCHAR(20)
)

INSERT INTO @Mapping(var1,var2,var3,str1,str2)
SELECT ParamValues.Item.query('var1').value('.','INT')
, ParamValues.Item.query('var2').value('.','INT')
, ParamValues.Item.query('var3').value('.','INT')
, ParamValues.Item.query('str1').value('.','NVARCHAR(255)')
, ParamValues.Item.query('str2').value('.','NVARCHAR(20)')
FROM @XML.nodes('/Items/Item') AS ParamValues(Item)

INSERT INTO test_table(var1,var2,var3,str1,str2)
SELECT m.var1,m.var2,m.var3,m.str1,m.str2
FROM @Mapping m", conn);
cmd.CommandTimeout = 10000;
cmd.Parameters.Add(new SqlParameter("@xmlString", xmlString));

// Iterate through all the items we're going to create
try {
cmd.ExecuteNonQuery();
} finally {
conn.Close();
}


Relative Performance


Bet you're wondering how they stack up! Here are the numbers for a run of 10,000 records:

Insert with Command/Parameter - 64.48 seconds
Insert Stored Procedure - 58.40 seconds
Insert Stored Procedure using BeginTransaction - 16.03 seconds
Insert Union All using SQLParameters - 3.18 seconds
Insert Union All using escaped text - 1.42 seconds
Insert XML - Occasional Crash!


Recommendations


If you're writing general purpose code, just use the plain old stored procedure approach, and use a transaction if you do multiple inserts at a time. It's guaranteed to be reliable, it's guaranteed to be a bit faster than the typical method, and it's all around good practice.

However, if you do need absolute speed, I highly recommend the Union All approach. The best part is, if you switch to a different database, most other RDBMS programs support something similar to Union All so your code will directly convert over!
  |  permalink  |  related link  |   ( 2.9 / 264 )

Next