C# course

Lecture 10

LINQ part 1

Agenda

  • explanation, motivation
  • LINQ & collections (why LINQ is needed when working with collections)
  • extension methods in LINQ context
  • syntax forms
  • LINQ to Objects
  • understanding var and IEnumerable in LINQ context
  • functional programming and lambda in context

What is LINQ?

  • stands for “Language INtegrated Query”
  • is a set of features that extends powerful query capabilities to the language syntax of C# and Visual Basic
  • introduces standard, easily-learned patterns for querying and updating data

  • the technology can be extended to support potentially any kind of data store.
  • Visual Studio includes LINQ provider assemblies that enable the use of LINQ with
    • .NET Framework collections
    • SQL Server databases
    • ADO.NET Datasets
    • XML documents

Motivation

LINQ to Objects

Suppose you have a data source:

int[] source = { 0, -5, 12, -54, 5, -67, 3, 6 };
you need **to find** the positive integers:
{ 12, 5, 3, 6 };
and **to sort** them from largest to smallest:
{ 12, 6, 5, 3 };

C# 2.0:

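List<int> results = new List<int>();
foreach (int i in source)
{
    if (i > 0)
    {
        results.Add(i);
    }
}

// Sort in descending order. C# 2.0 has no lambdas (they arrived in C# 3.0),
// so an anonymous method is used; CompareTo avoids overflow of x2 - x1.
results.Sort(delegate(int x1, int x2) { return x2.CompareTo(x1); });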

Syntax forms

Fluent (query method) syntax:
var results = source.Where(i => i > 0)
                    .OrderByDescending(i => i);
Query expression syntax:
var results = from i in source
              where i > 0
              orderby i descending
              select i;

LINQ to SQL

We can query a SQL database without changing the query code

fluently
database.Products
        .Where(p => p.Category.CategoryName == "Beverages")
        .Select(p => new Product
        {
            ProductName = p.ProductName,
            UnitPrice = p.UnitPrice
        });
or with an expression
from product in database.Products
where product.Category.CategoryName == "Beverages"
select new Product
{
    ProductName = product.ProductName,
    UnitPrice = product.UnitPrice
};

One way to query different data sources

LINQ to Wikipedia
wiki.Query.categorymembers()
    .Where(c => c.title == "Category:Mammals of Indonesia")
    .Select(c => c.title)
    .ToEnumerable();

OR

(from cm in wiki.Query.categorymembers()
 where cm.title == "Category:Mammals of Indonesia"
 select cm.title)
.ToEnumerable();

Infrastructure of LINQ

LINQ providers

  • LINQ to Amazon
  • LINQ to Indexes
  • LINQ To Geo
  • LINQ to LDAP
  • LINQ to Excel
  • LINQ to Active Directory
  • LINQ to LLBLGen Pro
  • LINQ to Expressions (MetaLinq)
  • LINQ to Lucene
  • LINQ to Facebook
  • LINQ to JSON
  • LINQ Extender
  • LINQ to Metaweb (Freebase)
  • LINQ to Flickr
  • LINQ over C# project
  • LINQ to NHibernate
  • LINQ to SimpleDB
  • LINQ to Streams
  • LINQ to CRM
  • LINQ to JavaScript
  • LINQ to XtraGrid
  • LINQ to MySQL, Oracle and PostgreSQL (DbLinq)
  • LINQ to Opf3
  • LINQ to WMI
  • LINQ to WebQueries
  • LINQ to SharePoint
  • LINQ to Google
  • LINQ to Parallel (PLINQ)
  • LINQ to RDF Files

Key benefits

  • Independence from the data source
  • Strong typing
  • Query compilation
  • Deferred execution

What makes LINQ the way it is?

  • Automatic Property
  • Object Initializer And Collection Initializer
  • Type Inference
  • Anonymous Type
  • Extension Method
  • Lambda Expression
  • Query Expression

LINQ and collections

Extension methods in LINQ context

How horrible it looks without extension methods:

IEnumerable<string> query =
  Enumerable.Select(
    Enumerable.OrderBy(
      Enumerable.Where(
        names, n => n.Contains("a")
      ), n => n.Length
    ), n => n.ToUpper()
  );
But with them:
IEnumerable<string> query = names.Where(n => n.Contains("a"))
                                 .OrderBy(n => n.Length)
                                 .Select(n => n.ToUpper());
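
To make the mechanism concrete, here is a minimal sketch of how an operator such as Where could be written as an extension method. The names MyWhere, MyEnumerableExtensions and ExtensionDemo are hypothetical, chosen so they do not clash with the real Enumerable.Where; the framework's actual implementation is more elaborate.

```csharp
using System;
using System.Collections.Generic;

public static class MyEnumerableExtensions
{
    // The "this" modifier on the first parameter lets callers write
    // source.MyWhere(...) instead of MyEnumerableExtensions.MyWhere(source, ...).
    public static IEnumerable<T> MyWhere<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            if (predicate(item))
            {
                yield return item; // items are produced lazily, one at a time
            }
        }
    }
}

public static class ExtensionDemo
{
    public static void Main()
    {
        string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay" };

        // Both forms call the same static method.
        IEnumerable<string> viaExtension = names.MyWhere(n => n.Contains("a"));
        IEnumerable<string> viaStatic = MyEnumerableExtensions.MyWhere(names, n => n.Contains("a"));

        Console.WriteLine(string.Join(", ", viaExtension)); // Harry, Mary, Jay
        Console.WriteLine(string.Join(", ", viaStatic));    // Harry, Mary, Jay
    }
}
```

Because the method returns IEnumerable<T>, the result can be passed straight to the next operator, which is what makes the chained form possible.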

Extension methods in LINQ context

It can even be as readable as this:

var period = 8.October(2015).To(DateTime.Today)
                            .Step(1.Days())
                            .Select(d => d.Date);

foreach (DateTime day in period)
{
    Console.WriteLine(day);
}
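
The slide does not show how October, To, Step and Days are defined. One possible set of extension methods that would make such code compile is sketched below; the DateRange type and the FluentDateExtensions class are assumptions made for illustration, not the author's implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper type holding the two ends of a period.
public sealed class DateRange
{
    public DateRange(DateTime start, DateTime end) { Start = start; End = end; }
    public DateTime Start { get; }
    public DateTime End { get; }
}

public static class FluentDateExtensions
{
    // 8.October(2015) -> 8 October 2015
    public static DateTime October(this int day, int year) => new DateTime(year, 10, day);

    // 1.Days() -> a TimeSpan of one day
    public static TimeSpan Days(this int count) => TimeSpan.FromDays(count);

    public static DateRange To(this DateTime start, DateTime end) => new DateRange(start, end);

    // Yields every date from Start to End (inclusive) in increments of step.
    // The check below runs on first enumeration, because this is an iterator.
    public static IEnumerable<DateTime> Step(this DateRange range, TimeSpan step)
    {
        if (step <= TimeSpan.Zero) throw new ArgumentOutOfRangeException(nameof(step));
        for (DateTime current = range.Start; current <= range.End; current += step)
        {
            yield return current;
        }
    }
}

public static class FluentDateDemo
{
    public static void Main()
    {
        var period = 8.October(2015).To(10.October(2015)).Step(1.Days());
        foreach (DateTime day in period)
        {
            // Prints 2015-10-08, 2015-10-09, 2015-10-10 on separate lines.
            Console.WriteLine(day.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
        }
    }
}
```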

Query expression vs fluent syntax

Fluent syntax is shorter for a simple where:

var adults = people.Where(person => person.Age >= 18);

vs

var adults = from person in people
             where person.Age >= 18
             select person;

Query expression vs fluent syntax

Query expressions shine with joins:

from defect in SampleData.AllDefects
join subscription in SampleData.AllSubscriptions
  on defect.Project equals subscription.Project
select new { defect.Summary, subscription.EmailAddress };

vs

SampleData.AllDefects.Join(SampleData.AllSubscriptions, 
                defect => defect.Project,
                subscription => subscription.Project,
                (defect, subscription) => new
                {
                    defect.Summary,
                    subscription.EmailAddress
                });

Query expression vs fluent syntax

And ordering:

orderby item.Rating descending, item.Price, item.Name

instead of:

.OrderByDescending(item => item.Rating)
.ThenBy(item => item.Price)
.ThenBy(item => item.Name);

Query expression vs fluent syntax

But bear in mind that you cannot write everything in query expressions

(from product in SampleData.AllProducts
 where product.Category.CategoryName == "Beverages"
 orderby product.ProductName
 select product) // Query expression cannot do pagination.
.Skip(50)       // So it has to be mixed with query methods.
.Take(10);

Query expression == fluent syntax

From the compiler's perspective, query expressions are just syntactic sugar: it always translates them into method calls.

For example, the following code

var adults = from person in people
             where person.Age >= 18
             select person;
will be translated to
var adults = people.Where(person => person.Age >= 18);

This is similar to preprocessing (if that is familiar to you): a purely syntactic rewrite that happens before type checking.

IEnumerable helps LINQ rock and roll

A LINQ query can produce one of two kinds of results:

an enumeration

IEnumerable<int> res = from s in sequence
                       where s > 3
                       select s;

a scalar (statistic)

int res = (from s in sequence
           where s > 3
           select s).Count();

LINQ to Objects

A subset of LINQ which

  • is executed in memory
  • with any .NET collection which implements the IEnumerable interface
  • without any intermediate provider such as LINQ to SQL or LINQ to XML

Why use var in LINQ context

Always use var when storing the result of a query, because...

The type of the result can be rather clumsy:
IEnumerable<IGrouping<string, Person>> result = people.GroupBy(n => n.Name);

vs

var result = people.GroupBy(n => n.Name);

Why use var in LINQ context

It can be an enumeration of anonymous types:

var result = from person in people
             select new { person.Name, person.Surname };

Why use var in LINQ context

You never know how your LINQ query will change, so var saves you from updating the result type after every small change.

For example, the query was:
var result = from person in people
             select new Person { Name = person.Name, Surname = person.Surname };
But then someone decided to group it:
var result = from person in people
             group person by person.Name;

What is Lambda (λ-calculus)?

Lambda calculus is a formal system that expresses computation through functions and function application. In simple words, it says that any computation can be built by applying simple functions, which can be composed into more complex (higher-order) functions.

It was introduced in the 1930s by Alonzo Church, the doctoral advisor of Alan Turing.

Lambda in C# context

A fancy feature introduced in C# 3.0:

delegate int MyDel(int x);                             // Delegate type used below

MyDel del = delegate(int x)    { return x + 1; } ;     // Anonymous method (C# 2.0)
MyDel le1 =         (int x) => { return x + 1; } ;     // Lambda expression 
MyDel le2 =             (x) => { return x + 1; } ;     // Lambda expression 
MyDel le3 =              x  => { return x + 1; } ;     // Lambda expression 
MyDel le4 =              x  =>          x + 1    ;     // Lambda expression

Functional programming in LINQ context

Main techniques of FP:

  • Closure
  • Currying
  • Memoization

Closure

Using variables from the enclosing scope inside a function:

int age = 20;
Func<int, int> getOlderOn = x => age + x;

In LINQ context:

var filter = "Compare";

var query = from m in typeof(String).GetMethods()
            where m.Name.Contains(filter)
            select new { m.Name, ParameterCount = m.GetParameters().Length };
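
One consequence worth seeing in action: a C# closure captures the variable itself, not a copy of its value at the time the lambda is created. The sketch below uses a small hypothetical names array instead of reflection so that the output is predictable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureDemo
{
    public static void Main()
    {
        int age = 20;
        Func<int, int> getOlderOn = x => age + x;
        Console.WriteLine(getOlderOn(5)); // 25

        age = 30; // the lambda reads the variable, so it sees the new value
        Console.WriteLine(getOlderOn(5)); // 35

        // The same holds for a query: the captured filter is read
        // when the query is enumerated, not when it is defined.
        string[] names = { "Compare", "CompareTo", "Trim", "TrimEnd" };
        var filter = "Compare";
        IEnumerable<string> query = names.Where(n => n.Contains(filter));

        filter = "Trim"; // changed before enumeration
        Console.WriteLine(string.Join(", ", query)); // Trim, TrimEnd
    }
}
```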

Currying

Turn a function of several arguments into a chain of functions that take the arguments one by one:

var grep = Curry<Regex, IEnumerable<string>, IEnumerable<string>>(
                (regex, list) => from s in list
                                 where regex.Match(s).Success
                                 select s);
var grepFoo = grep(new Regex("foo"));
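
Curry is not a framework method, and the slide does not show its definition. A minimal sketch of a two-argument version might look like this; the Functional class name is an assumption, and Regex.IsMatch is used here as a shorter equivalent of Match(s).Success.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class Functional
{
    // Turns f(a, b) into a function that takes a and returns
    // another function waiting for b.
    public static Func<T1, Func<T2, TResult>> Curry<T1, T2, TResult>(Func<T1, T2, TResult> f)
    {
        return a => b => f(a, b);
    }
}

public static class CurryDemo
{
    public static void Main()
    {
        var grep = Functional.Curry<Regex, IEnumerable<string>, IEnumerable<string>>(
            (regex, list) => list.Where(s => regex.IsMatch(s)));

        // Fix the first argument; grepFoo now only needs the list.
        var grepFoo = grep(new Regex("foo"));

        var lines = new[] { "foo bar", "baz", "food" };
        Console.WriteLine(string.Join(", ", grepFoo(lines))); // foo bar, food
    }
}
```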

Memoization

Simply cache the results of function calls:

Func<uint, uint> fib = null;
fib = x => x > 1 ? fib(x - 1) + fib(x - 2) : x;

fib = fib.Memoize();
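
Memoize is likewise not part of the framework. A simple dictionary-based sketch is shown below; the MemoizeExtensions class is hypothetical and the cache is not thread-safe. The call counter is added only to make the effect of caching visible.

```csharp
using System;
using System.Collections.Generic;

public static class MemoizeExtensions
{
    // Wraps f so that each distinct argument is computed only once.
    public static Func<T, TResult> Memoize<T, TResult>(this Func<T, TResult> f)
    {
        var cache = new Dictionary<T, TResult>();
        return x =>
        {
            TResult value;
            if (!cache.TryGetValue(x, out value))
            {
                value = f(x);
                cache[x] = value;
            }
            return value;
        };
    }
}

public static class MemoizeDemo
{
    public static void Main()
    {
        int calls = 0;
        Func<uint, uint> fib = null;
        fib = x => { calls++; return x > 1 ? fib(x - 1) + fib(x - 2) : x; };

        // The lambda captures the variable fib, so after this assignment
        // the recursive calls also go through the cache.
        fib = fib.Memoize();

        Console.WriteLine(fib(40)); // 102334155
        Console.WriteLine(calls);   // 41 (one call per argument 0..40)
    }
}
```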

Main benefits of FP:

  • Composability
  • Lazy evaluation
  • Immutability
  • Parallelizable
  • Declarative

Composability

The ability to compose complex things from a bunch of simple ones.

var results = source.Where(item => item > 0 && item < 10)
                    .OrderBy(item => item)
                    .Select(item => item.ToString(CultureInfo.InvariantCulture));

Lazy evaluation

The query is not evaluated until you iterate it.

var result = from person in people
             where person.Age > 21
             select new { person.Name, person.Surname };
// at this point the query has not been executed yet

var realResult = result.ToList();
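
The effect of deferred execution is easiest to observe by changing the source after the query has been defined. The following small sketch uses a hypothetical list of numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Defining the query does not run it.
        IEnumerable<int> big = numbers.Where(n => n > 1);

        numbers.Add(4); // added after the query was defined
        Console.WriteLine(string.Join(", ", big)); // 2, 3, 4

        // ToList() executes the query now and stores a snapshot.
        List<int> snapshot = big.ToList();

        numbers.Add(5);
        Console.WriteLine(snapshot.Count); // 3 (the snapshot is unchanged)
        Console.WriteLine(big.Count());    // 4 (the query is re-evaluated)
    }
}
```

Note that the query runs again on every enumeration, so materializing it with ToList() is the usual choice when the result is needed more than once.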

Immutability

The result of every operation is a new value:

var results = source.Where(item => item > 0 && item < 10) //new enumeration
                    .OrderBy(item => item)                //new enumeration
                    .Select(item => item.ToString());     //new enumeration

Parallelizable

Since everything is immutable, it is easier to parallelize:

Enumerable.Range(1, 10000)
          .AsParallel()
          .AsOrdered()
          .Where(IsPrimeNumber)
          .ToList();
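
IsPrimeNumber is not defined on the slide. A complete version of the example, assuming a simple trial-division predicate, could look like this:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PlinqDemo
{
    // Assumed helper: trial division, adequate for small numbers.
    public static bool IsPrimeNumber(int n)
    {
        if (n < 2) return false;
        for (int d = 2; d * d <= n; d++)
        {
            if (n % d == 0) return false;
        }
        return true;
    }

    public static void Main()
    {
        List<int> primes = Enumerable.Range(1, 10000)
                                     .AsParallel()   // split the work across cores
                                     .AsOrdered()    // keep the source order in the output
                                     .Where(IsPrimeNumber)
                                     .ToList();

        Console.WriteLine(primes.Count);                      // 1229
        Console.WriteLine(string.Join(", ", primes.Take(5))); // 2, 3, 5, 7, 11
    }
}
```

The predicate has no side effects and touches no shared state, which is what makes it safe to run on several threads at once.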

Declarative

Allows you to write more expressive code:

from d in ds.Doctors
join c in ds.Calls
on d.Initials equals c.Initials
where c.DateOfCall >= new DateTime(2015, 10, 1) &&
      c.DateOfCall <= new DateTime(2015, 10, 31)
group c by d.Initials into g
select g;
