Friday, August 14, 2015

Crawling HTML with CsQuery

Recently I was asked to build a crawler for a webpage.


The crawler was supposed to extract some of the values shown on the main page, such as the TV show name, its season and so on.

So how do you do it?

First you have to download the page’s HTML to your server:
// WebClient lives in System.Net; link is the URL of the page to crawl.
string htmlContent;
using (var client = new WebClient())
{
    htmlContent = client.DownloadString(link);
}
Now you have the whole document as a string. To get the relevant values you have to identify the path to the element that holds each value. Start from an element with an “id” attribute: ids are supposed to be unique within a page, so such an element makes a reliable root for the path. From that root you walk down the DOM until you reach the target element and its value.

For example, to get the TV show title from the page I’ll inspect the element in the browser:

The TV show name is the innerText of the “a” element. The first ancestor with an “id” attribute is <div id="main">. (Another element, <div class="subpage_title_block">, has no id but still looks fairly unique; either can be used as the root.)
After identifying the root element we need to describe the whole path explicitly:

div (id=main) => div => div => div (note that this div has an “a” element as its first child) => h3 => a. The last element has the TV show title as its innerText.
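
To make the path idea concrete, here is a minimal sketch of how a chain of direct-child steps narrows down a tree. It is written in TypeScript rather than C#, and the `TreeNode` type, the `findById` and `followChildPath` helpers, and the sample page (including the show title) are all hypothetical; none of them belong to CsQuery or to a real DOM API.

```typescript
// Hypothetical, simplified stand-in for a DOM element.
interface TreeNode {
  tag: string;
  id?: string;
  text?: string;
  children: TreeNode[];
}

// Depth-first search for the element with the given id (the root of the path).
function findById(node: TreeNode, id: string): TreeNode | undefined {
  if (node.id === id) return node;
  for (const child of node.children) {
    const hit = findById(child, id);
    if (hit !== undefined) return hit;
  }
  return undefined;
}

// Follow a chain of direct-child steps, keeping every node that matches at each level.
function followChildPath(root: TreeNode, tags: string[]): TreeNode[] {
  let current: TreeNode[] = [root];
  for (const tag of tags) {
    const next: TreeNode[] = [];
    for (const node of current) {
      for (const child of node.children) {
        if (child.tag === tag) next.push(child);
      }
    }
    current = next;
  }
  return current;
}

// Small helper for building the made-up sample tree.
const el = (
  tag: string,
  children: TreeNode[] = [],
  extra: { id?: string; text?: string } = {}
): TreeNode => ({ tag, children, ...extra });

// Made-up page: body > div#main > div > div > div > h3 > a
const page = el("body", [
  el("div", [el("div", [el("div", [el("div", [
    el("a", [], { text: "poster" }),
    el("h3", [el("a", [], { text: "Some Show" })]),
  ])])])], { id: "main" }),
]);

const main = findById(page, "main");
const titles = main
  ? followChildPath(main, ["div", "div", "div", "h3", "a"]).map(n => n.text)
  : [];
console.log(titles);
```

Each step only looks at direct children, which is why every intermediate element has to be listed in the path.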

The fun part

So how can we create such a path while the DOM is represented as a single string?
There are several solutions for crawling an HTML string. The most common is HtmlAgilityPack, which lets you perform LINQ-style operations on the DOM. It’s nice, but not simple enough to use.

There is another crawling solution: CsQuery (“Install-Package CsQuery” from NuGet). Its API is neat and straightforward, just the way you would want it:
CQ dom = "<div>Hello world! <b>I am feeling bold!</b> What about <b>you?</b></div>";

To create a new CQ instance, all you have to do is assign the HTML string to it.
The selection is really simple too:
var boldElements = dom["b"].Select(x => x.InnerText).ToList();

Here you select all the bold text from the DOM.
So how does CsQuery help us in our example?
var tvShowTitle = dom["div#main > div > div > div > h3 > a"].Text();

Yes! That simple!


Enjoy.

Tuesday, June 30, 2015

Domain Driven Design, CQRS & Event Sourcing

On July 12th I'm going to give a talk at the annual IDF .NET conference. This is the third time that I'm one of its organizers.
This time I'm going to talk about three major patterns: Domain Driven Design, CQRS and Event Sourcing.
I believe that young software developers ought to know design patterns, best practices and the big things that are happening in the industry. All of these patterns are becoming more and more popular for a reason. I think that most of the applications developed in the IDF could use some or all of them.

Here are the slides - http://slides.com/dennisnerush/cqrs#/



Wish me luck!

The video of the lecture can be found here.

Tuesday, June 16, 2015

Creating Nuget Packages using Visual Studio 2015

If you have ever created a NuGet package you know how annoying it is: you use the NuGet command line to pack the project, then create the .nuspec file, and then build it for the desired frameworks.
Visual Studio 2015 (currently RC) lets you create NuGet packages extremely easily!
First create a new Class Library project:
clip_image002
Notice that it is under ‘Web’ category.
clip_image004
Now we’ll enable the NuGet package creation, just as written in the comments above.
clip_image006
I just love their UX in this window.
Now we’ll need to fill in the package details in the project.json file.
clip_image007
The project.json file is almost the same as the .nuspec file. We can include the package dependencies and the target frameworks.
Now if we compile the project and navigate to the solution directory, we’ll see the NuGet package in the artifacts folder.
clip_image008
clip_image009
Good luck!

Friday, June 12, 2015

Aligning Elements using CSS3 “Flex” options

CSS3 introduces the flexbox layout (“flex”), which makes aligning elements extremely easy. It is supported in all the major current browsers; Internet Explorer 11 supports it only partially, and IE 9 and below do not support it at all.

Let’s say that I want to align four elements in a line:

clip_image002[4]

The CSS:

.container {
    padding: 0;
    margin: 0;
    list-style: none;
    display: flex;
}

.item {
    background: green;
    padding: 5px;
    width: 150px;
    height: 150px;
    margin: 10px;
    line-height: 150px;
    color: white;
    font-weight: bold;
    font-size: 3em;
    text-align: center;
    -ms-border-radius: 30%;
    border-radius: 30%;
}


The only new property here is “display: flex”. Flex not only lays the elements out in a single line, it also makes them responsive: when the container gets too narrow, the items shrink to fit, overriding their explicit width property.

clip_image003[5]

If you wish to keep the element’s original width you can add the flex-wrap:wrap property which will break the line if needed:

clip_image004[5]

With flex-wrap: wrap-reverse the wrapped lines are stacked in the opposite order, so the items that break onto a new line appear above the first line instead of below it:

clip_image005[5]

You can align the whole line to the center, or to the start or end of the container (flex-start/flex-end), using the justify-content property. By default it is set to flex-start:

clip_image007[5]


.container {
    padding: 0;
    margin: 0;
    list-style: none;
    display: flex;
    flex-wrap: wrap;
    justify-content: center;
}


There are two more options for justify-content, space-between and space-around, which distribute the free space evenly between the elements:

clip_image009[9]


.container {
    padding: 0;
    margin: 0;
    list-style: none;
    display: flex;
    flex-wrap: wrap;
    justify-content: space-between;
}


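With space-between the first and the last elements sit flush against the container’s edges. If you choose space-around, every element gets the same amount of space on both of its sides, so the first and the last elements are also spaced from the borders, by half the gap that separates two neighbouring elements.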
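
The justify-content options are easiest to compare in numbers. The following TypeScript sketch computes where each item starts on a single, non-wrapping line. The `itemOffsets` function and the 1200px container are made up for illustration, and the model deliberately ignores margins, padding and overflow, so treat it as an approximation of the rules and not as the browser’s actual algorithm.

```typescript
type Justify = "flex-start" | "center" | "flex-end" | "space-between" | "space-around";

// Returns the left offset of each item on a single, non-wrapping line.
// Simplified model: no margins, no padding, free space never negative.
function itemOffsets(containerWidth: number, itemWidths: number[], justify: Justify): number[] {
  const n = itemWidths.length;
  const used = itemWidths.reduce((sum, w) => sum + w, 0);
  const free = Math.max(containerWidth - used, 0);

  let start = 0; // space before the first item
  let gap = 0;   // space between neighbouring items
  if (justify === "center") {
    start = free / 2;
  } else if (justify === "flex-end") {
    start = free;
  } else if (justify === "space-between" && n > 1) {
    gap = free / (n - 1);
  } else if (justify === "space-around" && n > 0) {
    gap = free / n;   // each item gets free / (2n) on both of its sides
    start = gap / 2;
  }

  const offsets: number[] = [];
  let x = start;
  for (const w of itemWidths) {
    offsets.push(x);
    x += w + gap;
  }
  return offsets;
}

const widths = [150, 150, 150, 150];
const between = itemOffsets(1200, widths, "space-between");
const around = itemOffsets(1200, widths, "space-around");
console.log(between); // first item starts at the left edge
console.log(around);  // first item starts half a gap away from the left edge
```

With four 150px items in a 1200px container, space-between leaves 200px between neighbours and nothing at the edges, while space-around leaves 150px between neighbours and 75px at each edge.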

Vertical Alignment

You can stack the elements vertically using flex-direction: column.

clip_image010[4]


Using align-items you can control how the elements are aligned along the cross axis (the horizontal axis when the direction is column), much like justify-content does along the main axis:

clip_image011[4]


.container {
    padding: 0;
    margin: 0;
    list-style: none;
    display: flex;
    flex-wrap: wrap;
    align-items: center;
    flex-direction: column-reverse;
}


So far we’ve applied the flex properties only to the elements’ container. We can also apply some flex properties directly to the elements, for example to make one item grow more than the rest:

clip_image013[4]


.container > .grow {
    flex-grow: 3;
}


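I’m using the flex-grow property on the chosen item and setting its value to 3. flex-grow controls how much of the line’s free space an item receives relative to its siblings: an item with flex-grow: 3 gets three times the extra space of a sibling with flex-grow: 1, and since the default is 0, siblings without the property do not grow at all.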
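
How flex-grow shares out the free space can be sketched with a bit of arithmetic. The TypeScript below is a simplified model and not the browser’s layout code: `FlexItem`, `grownWidths` and the 1200px container are hypothetical, and margins, padding, flex-shrink and min/max constraints are ignored.

```typescript
interface FlexItem {
  basis: number; // the item's starting width in px
  grow: number;  // its flex-grow value (the CSS default is 0)
}

// Distribute the free space of the line among the items in proportion to flex-grow.
function grownWidths(containerWidth: number, items: FlexItem[]): number[] {
  const used = items.reduce((sum, item) => sum + item.basis, 0);
  const free = Math.max(containerWidth - used, 0);
  const totalGrow = items.reduce((sum, item) => sum + item.grow, 0);
  // When the grow factors add up to less than 1, only that fraction of the free space is handed out.
  const divisor = Math.max(totalGrow, 1);
  return items.map(item => item.basis + (free * item.grow) / divisor);
}

// Every sibling grows a little, one item grows three times as much.
const mixed = grownWidths(1200, [
  { basis: 150, grow: 1 },
  { basis: 150, grow: 3 },
  { basis: 150, grow: 1 },
  { basis: 150, grow: 1 },
]);

// Only one item has flex-grow, the rest keep the default of 0.
const single = grownWidths(1200, [
  { basis: 150, grow: 0 },
  { basis: 150, grow: 3 },
  { basis: 150, grow: 0 },
  { basis: 150, grow: 0 },
]);

console.log(mixed);
console.log(single);
```

In the first case the item with flex-grow: 3 gains 300px while each sibling gains 100px, so it ends up wider but not three times as wide. In the second case it absorbs all of the free space, because no other item is allowed to grow.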

Flex has more properties that can help you align your elements easily. Go ahead and try!

Wednesday, May 20, 2015

Chrome Developer Tools Tips and Tricks

At the last ALT.NET meetup @Shay Friedman talked about “Chrome Developer Tools Tips & Tricks”. The whole audience, myself included, was amazed by features Shay showed that none of us knew about.

The Tips & Tricks:

1. Functions navigation in JavaScript source file:

You’re all familiar with this window:

clip_image002

Some of you might know that hitting Ctrl + P opens a JS source file from your workspace. But did you know that hitting Ctrl + Shift + P lets you navigate to a function in the currently open file?

clip_image004

No more endless scrolling!

2. Style search

This window is also pretty common.

clip_image006

We scroll down to figure out which style is messing up our element’s CSS. But did you notice that you can search (!!!) for a style property or value?

clip_image007

clip_image009

3. Color picker

This is a color property.

clip_image011

Usually in order to change a color you will double-click the area and manually write the new color. However if you click on the colored box itself you will get a color picker dialog!

clip_image013

And if you move the cursor over the page you can even pick a color from it!

clip_image015

4. Disable Cache and reload cache

Remember that time you kept hitting F5 to refresh the page but had no luck, and then realized that your cache wasn’t disabled? First and foremost, disable it. If for some reason you still think that your changes are not being loaded into the browser, you can right-click the refresh icon (while the developer tools are open) and try Hard Reload or Empty Cache and Hard Reload. That should do it.

clip_image017

5. Filtering the network files

If you click the filter icon in the Network tab you can filter the requests by file type. Hold Ctrl to select multiple file types.

clip_image019

6. Black box script in Call Stack

When you debug your app and hit a break point you’ll sometimes want to check the call stack window to see previous calls.

clip_image021

Since I use Angular, this call stack contains a lot of Angular’s own calls, which I couldn’t care less about. However, if I right-click the Angular script file I can mark it as a blackbox script.

clip_image022

And now I won’t see its calls in the call stack window.

clip_image023

If you didn’t know some of these features there is no need to feel embarrassed. Google uses tons of UX best practices in all its products to make the available features discoverable, but not in Chrome. Here I think they actually tried to hide some of them.

If you have more tips and tricks, feel free to comment.

Thursday, April 9, 2015

Improving Angular.js performance using One Time Binding

In the previous post I explained how the $digest loop works and how to increase performance using the ng-model-options directive. In this post we’ll take a look at the ‘one-time binding’ feature.

As I mentioned before, the $digest loop goes through all the registered watches every time in order to check whether they need to be updated. However, not all of our scope variables and models need binding or need to be updated after they have been initialized. Examples are a logo path, a title, names, or a constant list of options that we render with ng-repeat. All of these should be initialized once and not updated again until a full page refresh.

One-time binding was introduced in Angular 1.3, along with ng-model-options. To mark an expression as one-time bound, all you have to do is add :: before it:

<input type="text" ng-model="name" value="text" />
{{::name}}
{{name}}


Prefixing an expression with :: tells Angular to stop watching it once it has resolved to a defined value. That reduces the number of watches the $digest loop has to check, which improves the app’s performance.
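
The effect on the digest loop can be illustrated with a toy model. The TypeScript sketch below is not Angular’s real implementation: `MiniScope`, `Watcher` and the sample `model` are hypothetical, and a real digest keeps looping until the values are stable. It only shows the idea that a one-time watcher drops out of the list once its value is defined.

```typescript
// Hypothetical, heavily simplified model of a digest loop; not Angular's real code.
interface Watcher {
  read: () => unknown; // reads the watched value from the scope
  oneTime: boolean;    // true for expressions prefixed with "::"
  last?: unknown;      // the value seen on the previous pass
}

class MiniScope {
  private watchers: Watcher[] = [];

  watch(read: () => unknown, oneTime = false): void {
    this.watchers.push({ read, oneTime });
  }

  // One pass over all watchers; returns how many of them were checked.
  digest(): number {
    const checked = this.watchers.length;
    this.watchers = this.watchers.filter(w => {
      w.last = w.read();
      // A one-time watcher is dropped as soon as its value is defined.
      return !(w.oneTime && w.last !== undefined);
    });
    return checked;
  }
}

const model: { name?: string } = {};
const scope = new MiniScope();
scope.watch(() => model.name);       // stands in for {{name}}
scope.watch(() => model.name, true); // stands in for {{::name}}

const first = scope.digest();  // name is still undefined: both watchers stay
model.name = "Ada";
const second = scope.digest(); // both are checked, then the one-time watcher is removed
model.name = "Grace";
const third = scope.digest();  // only the regular watcher is left to check
console.log(first, second, third);
```

Every one-time binding on a page shrinks the list in the same way, which is where the saving on large pages comes from.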

Enjoy