One of the best ways for me to understand something (relatively) new is to figure out how it’s like something else, what new role it plays, and how to use it. Language Integrated Query (LINQ) is a relatively new thing. Let’s step back and look at some existing code that is a ubiquitous staple (see Listing 1).
Listing 1: A basic for-loop that copies a desired subset of items from one collection to another.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace CompareLinqToForLoop
{
    class Program
    {
        static void Main(string[] args)
        {
            // Source collection
            int[] integers = {1,2,3,4,5,6,7,8,9};
            // Target collection that receives the desired subset
            List<int> results = new List<int>();
            // Iterate, test each element, and copy the even ones
            for (int k = 0; k < integers.Length; k++)
                if (integers[k] % 2 == 0)
                    results.Add(integers[k]);
            Array.ForEach(results.ToArray(), r => Console.WriteLine(r));
            Console.ReadLine();
        }
    }
}
The code is trivial, but it demonstrates a basic use of a for-loop. The code starts with an array of integers. A new target object, the generic List, is created to receive the desired subset. The for-loop iterates over the source collection of integers, tests each element (in this case, for evenness), and places the elements that pass in the target List. In a general sense the for-loop iterates over items and performs an operation on them. In this specific example the operation is to copy selected items to a new subset, the generic List.
LINQ performs a very similar role to the for-loop and subset generator. Instead of separately creating the target object, defining the for-loop, and writing a test, LINQ supports all of these operations in an integrated way. I don’t mean to downplay the value of LINQ or oversimplify it, but if you use this knowledge as a basic starting point (for-loop plus subset collection generator), then LINQ is pretty approachable. Using LINQ, the code in Listing 1 can be rewritten as follows (see Listing 2):
Listing 2: Selecting a subset of objects using LINQ.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace CompareLinqToForLoop
{
    class Program
    {
        static void Main(string[] args)
        {
            int[] integers = {1,2,3,4,5,6,7,8,9};
            var results = from i in integers
                          where i % 2 == 0
                          select i;
            Array.ForEach(results.ToArray(), r => Console.WriteLine(r));
            Console.ReadLine();
        }
    }
}
The from clause is the equivalent of the for-loop. The name right after the from keyword is called the range variable. The range variable plays a role similar to the loop variable in the for-loop, except that it stands for the current element rather than an index. The range variable is i in Listing 2. The value after the in keyword is the source collection. The where clause plays the role of the test. In Listing 1 an if-conditional is used; in Listing 2 the where clause plays the same role. The where clause in LINQ looks and works much like a SQL where clause. Finally, the select clause performs the role of copying the desired target objects. The term often associated with the select clause is projection: the select clause determines the shape of each result, and the term comes up most often when you apply the new keyword in the select clause to define a new type. In the example the select clause is simply accumulating the integers that pass the test in the where clause.
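To make the idea of projection a little more concrete, here is a minimal sketch in which the select clause transforms each element instead of copying it unchanged. The class name ProjectionSketch and the method SquaresOfEvens are made up for illustration; the comments show which part of the Listing 1 loop each clause corresponds to.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace CompareLinqToForLoop
{
    // Hypothetical class and method names, used only for this sketch.
    class ProjectionSketch
    {
        public static IEnumerable<int> SquaresOfEvens(int[] source)
        {
            return from i in source      // the loop: visit each element
                   where i % 2 == 0      // the test: keep even values
                   select i * i;         // the operation: project the square
        }

        static void Main(string[] args)
        {
            int[] integers = {1,2,3,4,5,6,7,8,9};
            // Prints 4, 16, 36 and 64, one value per line.
            foreach (int square in SquaresOfEvens(integers))
                Console.WriteLine(square);
        }
    }
}
```

The only change from Listing 2 is the expression after select, which is where the shape of each result is decided.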
The result of the LINQ query is an IEnumerable<T> object, where T in the example is int. The result of the LINQ query plays the role of the instantiated List<T> in Listing 1. The var keyword tells the compiler to infer the type of results from the query, which is also what allows a variable to hold an anonymous type. You can declare the return type explicitly on the left side of the equals sign, but you don’t have to. The reason to use var, besides less typing, is that LINQ lets you project a new, as yet undefined, anonymous type, which has no name you could write on the left side.
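As a small sketch of the difference, the query from Listing 2 can be declared both ways. The class name ExplicitVersusVar and the variable names are assumptions made for this example only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace CompareLinqToForLoop
{
    // Hypothetical class name, used only for this sketch.
    class ExplicitVersusVar
    {
        static void Main(string[] args)
        {
            int[] integers = {1,2,3,4,5,6,7,8,9};

            // Return type spelled out on the left side of the equals sign.
            IEnumerable<int> explicitResults = from i in integers
                                               where i % 2 == 0
                                               select i;

            // Same query; the compiler infers IEnumerable<int> for var.
            var inferredResults = from i in integers
                                  where i % 2 == 0
                                  select i;

            // Both queries yield the same sequence, so this prints True.
            Console.WriteLine(explicitResults.SequenceEqual(inferredResults));
        }
    }
}
```

For a query that selects plain integers the two declarations are interchangeable; var only becomes necessary once the select clause produces an anonymous type.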
Anonymous types are where LINQ starts to become really powerful. Suppose you have a collection of Employee objects and you want a contact list: a subset of the employee data containing just each employee’s name and phone number. Historically, you would have the Employee class, define a second class containing only the name and number fields, and then use a for-loop to move the data from the source Employee collection to a collection of instances of the new type. With LINQ you don’t have to define the new type containing name and number. You can define the new type in the select clause with the new keyword and specify just the values for the projected type. At its essence LINQ is like a for-loop. However, because you can define new types in the select clause on the fly, you can avoid defining all of those extra classes that are used for lookups, reports, or answers to user queries.
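The contact list scenario might be sketched as follows. The Employee class, its properties, and the sample data are all hypothetical, invented for illustration; the point is only that no separate contact class is declared anywhere.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace CompareLinqToForLoop
{
    // Hypothetical Employee class; the properties are assumptions for this sketch.
    class Employee
    {
        public string Name { get; set; }
        public string Phone { get; set; }
        public string Department { get; set; }
        public decimal Salary { get; set; }
    }

    class ContactListSketch
    {
        public static List<string> ContactLines(List<Employee> employees)
        {
            // The select clause defines an anonymous type with just Name and Phone.
            var contacts = from e in employees
                           select new { e.Name, e.Phone };

            List<string> lines = new List<string>();
            foreach (var contact in contacts)
                lines.Add(contact.Name + ": " + contact.Phone);
            return lines;
        }

        static void Main(string[] args)
        {
            // Made-up sample data.
            List<Employee> employees = new List<Employee>
            {
                new Employee { Name = "Ann Lee", Phone = "555-0100", Department = "Sales", Salary = 52000m },
                new Employee { Name = "Raj Patel", Phone = "555-0101", Department = "IT", Salary = 61000m }
            };

            // Prints "Ann Lee: 555-0100" and then "Raj Patel: 555-0101".
            foreach (string line in ContactLines(employees))
                Console.WriteLine(line);
        }
    }
}
```

The anonymous type is used inside ContactLines and formatted into strings before returning because an anonymous type has no name that could appear in a method signature.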
If you are just getting started with LINQ, think: from is a for-loop, where is the test, and select is the operation performed for each iteration of the loop. If you need a new type, don’t define the class explicitly; define it in the select clause. (For more information on LINQ you can check out my book LINQ Unleashed for C# or look at some of the LINQ examples in my upcoming book Professional DevExpress ASP.NET Controls. The DevExpress book doesn’t emphasize LINQ, but there are some LINQ examples.)