Thoughts on software: software

Thursday, June 08, 2017

[CSS] How are conflicting styles resolved?

If you have worked in CSS, then you’ll know that you can assign a CSS property using the syntax:

property-name: value

For example, if you have a <span> tag with ID ‘content’, for which you want to assign the color green, you’d add this in your CSS file:

#content { color: green; }

There are other ways you can specify the same property:

span#content {color: green;}, and

.content {color: green;} in combination with <span class="content">lorem ipsum</span>

Here's the interview question

What happens though when you have multiple instances of the same property being set & they all apply to the same HTML tag too? Here’s an example:

Consider this tag,

<span id="content" style="color: blue;">some content</span>

while the CSS definition in the associated CSS file that can match the element is:

#content {color: green;}

Since multiple styles match, which one will the browser render? Answer: The text in the span element will be rendered in blue.

Why? Why did the browser decide to apply blue? As per the CSS spec, there are two aspects to be considered when deciding which style a browser will apply among competing styles. Resolving these two aspects tells the browser which competing style should win. They are: 1) Cascading order, & 2) Specificity. We'll first look at Cascading order and later in the post, Specificity.

Cascading order

In English, the term "cascade" is used to describe a process where there are multiple steps. For example, a cascading waterfall is one in which water flows down multiple steps.

If that is the case, what does "Cascading Style Sheets" mean? What steps are there in CSS? It turns out there are multiple ways through which style definitions for a web page can be assigned. They are: author, user & user agent.

Author styles are those which all software developers know - they are created by the authors of the web page as CSS files or style attributes in HTML tags.
User styles are those styles which users of web browsers can configure in their browser. For example, users can configure the browser to replace particular fonts with other fonts - this is particularly useful from an accessibility standpoint.
User agent styles are those styles that are provided by default by the browser. For example, if no colour information is provided, then text is rendered black on a white background by default - this is an example of user agent styling.

The "cascade" in Cascading Style Sheets flows thus: If there are conflicts in property definitions across user, author or user agent style definitions, then the precedence is as follows:

Author > User > User agent

Example 1

In this example, we’re going to determine what happens if a user CSS file has a definition that conflicts with a definition in the user agent's default CSS file. The user agent we’re going to use is Internet Explorer. It already has a user agent CSS file (this is why a plain HTML file without any styling will render black text on a white background.) We will now change the way IE renders text color inside <div> tags by default by providing a user CSS file.
Create a file by name, my_style.css. The content of this file is just this one line:

div{color:red}

We will now tell IE to use this file from now on for all web pages. The way to do so is this:

Open Internet Explorer
Click on the Tools menu & choose Internet Options
Click on the General tab & choose Accessibility. This opens the Accessibility dialog.
Under the User style sheet section, enable the Format documents using my style sheet checkbox.
Now click Browse… under the same checkbox and choose my_style.css.
Restart Internet Explorer.

We now need to create an HTML file that we can load into the browser to test that IE uses the my_style.css. Create a file by name, test_my_style.html. The content of this file is:

<html>
<head>
<title>Testing user styles</title>
</head>
<body>
<div>This is a test file to test user styles.</div>
</body>
</html>

Opening this file in Internet Explorer shows the text in red.

What happened here? The user agent, by default, will render text inside <div> tags as black-colored text. Our user file, my_style.css, overrode that, thus creating a conflict. IE followed the CSS spec, which states that user CSS property definitions have priority over user agent CSS property definitions, and rendered the text in red.

Example 2

What happens if we introduce a further conflict by having an author-defined CSS file? For this, we will create another CSS file, author_style.css, where we will provide the following definition:

div {color:blue}

We will also change test_my_style.html to include author_style.css as follows.

<html>
<head>
<title>Testing user styles</title>
<link href="author_style.css" rel="stylesheet">
</head>
<body>
<div>This is a test file to test user styles.</div>
</body>
</html>

Opening this file in Internet Explorer shows the text in blue.

What happened here? The user agent, by default, will render text inside <div> tags as black-colored text. Our user file, my_style.css, overrode that, thus creating a conflict. The author’s CSS file, author_style.css, overrode that even further, setting up another conflict. IE followed the CSS spec, which states that author CSS property definitions have priority over user and user agent CSS property definitions, and rendered the text in blue.

An exception

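The only exception to the cascade order above is a property definition marked as !important. For such definitions, the order between author and user is reversed: a user definition marked !important takes precedence over an author definition marked !important, which in turn takes precedence over all normal definitions. The cascade described here (CSS 2.1) has no level for !important definitions in the user agent CSS file.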

Let’s look at an example: We will reuse the same files as before, but we will change my_style.css to this:

div{color:red !important}

Now if we open our test_my_style.html in IE, the text is shown in red.

What happened here? The user agent, by default, will render text inside <div> tags as black-colored text. Our user file, my_style.css, overrode that, thus creating a conflict. The author’s CSS file, author_style.css, overrode that even further, setting up another conflict. However, IE noticed the !important in my_style.css and followed the CSS spec, which states that user CSS property definitions with !important have priority over all other CSS property definitions, and rendered the text in red.
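
The three examples above can be condensed into a small JavaScript sketch. It is only an illustration, assuming the five precedence levels of the CSS 2.1 cascade; the names LEVELS, cascadeLevel and cascadeWinner are made up for this post and are not part of any browser API.

```javascript
// Precedence levels in ascending order, as in the CSS 2.1 cascade.
// (Hypothetical names; this is an illustration, not a browser API.)
const LEVELS = [
  'user-agent',
  'user',
  'author',
  'author !important',
  'user !important',
];

// Map a declaration to its position in LEVELS. Combinations that are not
// listed (for example a user agent !important rule) return -1 and are
// simply not modelled here.
function cascadeLevel(decl) {
  const key = decl.important ? decl.origin + ' !important' : decl.origin;
  return LEVELS.indexOf(key);
}

// Pick the declaration that sits at the highest level.
function cascadeWinner(decls) {
  return decls.reduce((best, d) => (cascadeLevel(d) > cascadeLevel(best) ? d : best));
}

const examples = [
  { origin: 'user-agent', value: 'black' },
  { origin: 'user', value: 'red' },
  { origin: 'author', value: 'blue' },
];
console.log(cascadeWinner(examples).value); // blue, as in Example 2
```

Marking the user declaration with important: true moves it to the top level, which is why the text turned red in the last example.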

Specificity

The approach mentioned above can still leave conflicts, since a single user or author stylesheet can itself contain conflicting style definitions. To resolve these, CSS provides another mechanism which browsers can use - specificity. While there isn’t a plain-English definition of specificity in the spec, my definition is: Specificity determines how specific the style definition is. Here, specific refers to how many HTML elements the CSS selector matches - the fewer elements it matches, the more specific it is; the more elements it matches, the less specific it is. Treat this as an intuition only; what the browser actually uses is the calculation below.

Calculation of specificity

The calculation of specificity is done in the following manner:

Assume there are four numbers separated by commas, and their initial values are zero:

0,0,0,0

The first number represents the presence of a style attribute in the element's HTML. If a style attribute is present, then the first number becomes 1, otherwise 0.

The second number represents the number of id attributes in the selector.

The third number represents the number of classes, attributes and pseudo-classes in the selector.

The fourth number represents the number of element names and pseudo-elements in the selector.

Unlike in the decimal system, if a number reaches the value 10, then it does not carry over to the preceding number. Thus, specificity values like 0,10,0,9 are perfectly valid.

Now that we know what specificity is, let’s take a look at some example CSS definitions, and try to understand what specificity value they evaluate to:

Example 1: div.content {color:red}. It is not a style attribute in an HTML tag, nor does it have any HTML IDs mentioned in the selector. Thus the first two numbers are 0,0. It has a class attribute value mentioned (.content), and it also has an HTML element mentioned (div). Thus, the final two values of the specificity are 1,1. Hence its final specificity value is 0,0,1,1.

Example 2: #content::first-letter. It is not a style attribute in an HTML tag, but it has an HTML ID mentioned in the selector. Thus the first two numbers are 0,1. It has no classes, attributes or pseudo-classes, so the third number is 0. It has a pseudo-element mentioned (::first-letter), and it doesn't have any HTML elements mentioned, so the fourth number is 1. Hence its specificity value is 0,1,0,1.

Example 3: div[data-name="Tom"][data-url="/member/1"]. It is not a style attribute in an HTML tag, nor does it have any HTML IDs mentioned in the selector. Thus the first two numbers are 0,0. It has two attributes mentioned (data-name & data-url), and it has 1 HTML element mentioned (div). Thus, the final two values of the specificity are 2,1. Hence its specificity value is 0,0,2,1.
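
The counting rules above can be sketched in a few lines of JavaScript. This is a rough illustration and not how a browser parses selectors: the function name specificityOf is my own, it relies on simple regular expressions, and it assumes plain selectors like the ones in the examples (it does not understand :not(), escaped characters, or old single-colon pseudo-elements such as :first-line).

```javascript
// Returns the specificity of a selector as an array of four numbers.
// Pass true as the second argument for a style attribute, which has no
// selector at all.
function specificityOf(selector, fromStyleAttribute = false) {
  let rest = selector;
  // Count the matches of a pattern, then blank them out so that later
  // patterns do not count the same text again.
  const count = (pattern) => {
    const found = rest.match(pattern) || [];
    rest = rest.replace(pattern, ' ');
    return found.length;
  };

  const attributes = count(/\[[^\]]*\]/g);      // [data-name="Tom"]
  const pseudoElements = count(/::[\w-]+/g);    // ::first-letter
  const ids = count(/#[\w-]+/g);                // #content
  const classes = count(/\.[\w-]+/g);           // .content
  const pseudoClasses = count(/:[\w-]+/g);      // :hover
  const elements = count(/[a-zA-Z][\w-]*/g);    // div, span

  return [
    fromStyleAttribute ? 1 : 0,
    ids,
    classes + attributes + pseudoClasses,
    elements + pseudoElements,
  ];
}

console.log(specificityOf('div.content').join(','));            // 0,0,1,1
console.log(specificityOf('#content::first-letter').join(',')); // 0,1,0,1
console.log(specificityOf('div[data-name="Tom"][data-url="/member/1"]').join(',')); // 0,0,2,1
```

The order of the count calls matters: attributes are removed first so that a value like /member/1 is not mistaken for anything else, and :: is handled before : so that pseudo-elements are not counted as pseudo-classes.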

Resolving conflicts with specificity

Given two specificity values, you can compare them to find out which one is greater. A specificity value is greater than another if its first number is greater than the other's first number. If the first numbers of both values are the same, then the browser moves on to compare the second numbers, and so on.

Here are some examples:

1,0,0,0 is greater than 0,10,0,0

0,10,0,0 is greater than 0,0,20,0

How is specificity helpful in resolving conflicts? As per the CSS spec, browsers are supposed to resolve conflicts by choosing those CSS definitions that have a higher specificity.
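
The comparison described above is a plain left-to-right comparison of the four numbers. A minimal sketch, assuming specificity values are held as arrays of four numbers (compareSpecificity is a name I made up for this post):

```javascript
// Compare two specificity values given as arrays of four numbers.
// Returns a positive number if a is greater, a negative number if b is
// greater, and 0 if they are equal.
function compareSpecificity(a, b) {
  for (let i = 0; i < 4; i++) {
    if (a[i] !== b[i]) {
      return a[i] - b[i];
    }
  }
  return 0;
}

console.log(compareSpecificity([1, 0, 0, 0], [0, 10, 0, 0]) > 0);  // true
console.log(compareSpecificity([0, 10, 0, 0], [0, 0, 20, 0]) > 0); // true
```

Note that the numbers are never added up or carried over; the first position where the two values differ decides the result.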

Example

Let’s take the example in the interview question above:

In the HTML, we have:

<span id="content" style="color: blue;">some content</span>

while in the CSS file, we have:

#content {color: green;}

Constructing the specificity for the style definition in the HTML style attribute, we get:

1,0,0,0

Constructing the specificity for the CSS style definition, we get:

0,1,0,0

Because the first specificity value is greater than the second, the style definition in the style attribute of the HTML tag wins.

Is it possible to still have conflicts?

Yes. For example, there could be two definitions in an author CSS file which target the same elements and have the same specificity. In such cases, the CSS spec says that the definition that appears later wins.

An example:

Let’s say that we have two CSS definitions as below:

div {color:blue}

div {color:red}

for this HTML,

<html>
<head>
<title>Testing user styles</title>
</head>
<body>
<div>This is a test file to test user styles.</div>
</body>
</html>

Both CSS definitions evaluate to a value of 0,0,0,1.

In this case, the browser will simply render the text in red.
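
Putting specificity and source order together, a sketch of the tie-break might look like this. The rule objects and the name winningRule are hypothetical; each rule simply carries its specificity and its position in the stylesheet.

```javascript
// Pick the winning rule: higher specificity first, and if the
// specificity is equal, the rule that appears later in the stylesheet.
function winningRule(rules) {
  return rules.reduce((best, rule) => {
    for (let i = 0; i < 4; i++) {
      if (rule.specificity[i] !== best.specificity[i]) {
        return rule.specificity[i] > best.specificity[i] ? rule : best;
      }
    }
    // Equal specificity: the later rule wins.
    return rule.order > best.order ? rule : best;
  });
}

const rules = [
  { order: 0, specificity: [0, 0, 0, 1], color: 'blue' },
  { order: 1, specificity: [0, 0, 0, 1], color: 'red' },
];
console.log(winningRule(rules).color); // red
```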

Saturday, December 19, 2015

Why a mixed format is not recommended

While pairing with developers, I have often noticed that they have a tendency to periodically do a mixed format.

What is a mixed format?

Now I have no idea whether this is the official term, but here is what I mean when I say, “mixed format”. A mixed format is when a developer, working on some code, comes across some other code that is not formatted as per the project’s conventions. This code could span a few lines, or in worse cases, a whole file. The developer immediately invokes his editor’s format command, and formats the offending lines, or the whole file. With a satisfied smile on his face, the developer moves on to complete whatever work he was originally tasked to do. He then creates a commit that includes:

the work he was originally tasked to do, and
the formatting that he set right.

What’s wrong here?

Now, from the point of view of clean code and team work, formatting is not wrong. However, I do not recommend crafting a commit that mixes both format changes and logic changes, when the following conditions hold true:

The format changes are not related to the actual lines that the logic change encompasses
The format changes outnumber the logic changes

Why? Consider what happens when the developer goes ahead and checks in his code to the VCS. Other developers reviewing his commit immediately notice that the commit’s code changes are too many - this results in an impression forming in the reviewer's mind which can range between “Wow, this is a large commit. I need to go line by line” to a feeling of just giving up. With inexperienced or bored developers, it is usually the latter.

Also consider what happens when sometime in the future, a developer realizes that your commit introduced a line that causes a bug. In order to ensure a clean fix, he opens your commit with the intention of understanding what you intended to fix. And he arrives at the same realisation - your code changes are too many. Without any choice, he is forced to go through each line to understand what it does. Imagine his frustration when most lines turn out to be formatting changes, and hidden among the formatting changes is the actual change he’s looking for.

The lesson here is to avoid large formatting changes mixed with logic changes. Prefer to stick to formatting only those lines where your feature/bug also demands a change. If you can’t avoid this, then make two commits - one for the feature/bug changes, the other just for formatting changes.

This is only a recommendation, not a rule

As soon as you read this, please don’t fire up the comments editor or your blog editor to write a comment/blog about why I am wrong. I understand this is basically a Considered Harmful essay, and I know that Considered Harmful essays are considered harmful. With that in mind, I’ll only say that the above is a recommendation, not a rule. When making such a commit, please do think about how a future you would feel if you came across such a commit, and how you’d react.

Sunday, December 21, 2014

Git: What are diffs and hunks?

When I was learning Git for the first time many years ago, one of the features that made me go, "Wow!! That's something I have really wanted all these years!" was the ability to choose which changes to commit among all the changes in a given file. I hadn’t seen this in the other version control systems I’d used, which were CVS and SVN.

Here’s an example of what I am trying to illustrate. Suppose I have a file named Employee.java with the following contents,

class Employee {
     private String firstName;
     private String lastName;

     Employee(String firstName, String lastName) {
          this.firstName = firstName;
          this.lastName = lastName;
     }

     public boolean equals(Employee e) {
          if (!(e instanceof Employee))
               return false;
          return e.firstName.equals(this.firstName) && e.lastName.equals(this.lastName);
     }
}

Ignore the fact that there's no hashCode() implementation, please!!

You decide to add more functionality to Employee.java, namely, a grade instance variable and a toString() method that prints out who the employee is and what he does. Employee.java now looks like this:

class Employee {

     private String firstName;
     private String lastName;
     private String grade;

     Employee(String firstName, String lastName, String grade) {
          this.firstName = firstName;
          this.lastName = lastName;
          this.grade = grade;
     }

     public boolean equals(Employee e) {
          if (!(e instanceof Employee))
               return false;
          return e.firstName.equals(this.firstName) && e.lastName.equals(this.lastName);
     }

     public String toString() {
          return "I am " + this.firstName + " " + this.lastName + ", working as " + this.grade;
     }
}

Ignore the fact that grade is not part of equals(), please!!

When you do a git diff on Employee.java, this is what you get:

When you do a git add at this point, all the newly introduced code will be ready for commit. Let’s say you want to add the toString() function as a separate commit. In other VCSs, that's not simple. You will have to maintain two copies of Employee.java, with one copy introducing the grade variable, and another copy introducing toString(). This is cumbersome, but in Git, is very easy. You just do

git add -p

which allows you to choose what pieces of code change to commit. For the above example, doing git add -p would give you

At this point, keying in 'y' will add this to the index, after which the next piece of code change is shown.

and so on…

When I learnt this, I thought, "All that’s fine, but what is the word ‘hunk’ doing there in ‘Stage this hunk?’ What does it mean anyway?"

To know what’s a hunk, you’ll have to know more about the output of the diff command. Note that we are not talking about git diff, but just diff.

Understanding the diff command

diff is the Unix command that generates a report documenting the differences between two files. According to Wikipedia, given two files, a and b, with b being an updated version of a, diff basically reports what changes should be made to a to turn it into b.

The report that diff generates can take several forms, including: a) Edit script, b) Context format, and c) Unified format. With git diff, we get the Unified format.

The unified format, explained in short, goes like this:

The entire output of diff is called ‘diff’. That’s why people often say, “Send me the diff”. They are actually asking for the output of the diff command.

A diff begins with two lines that indicate the two files being compared. The first line begins with ‘---’ and indicates the original file, while the second line begins with ‘+++’ and indicates the newer file. Line additions are preceded with a ‘+’ symbol, while line deletions are preceded with a ‘-’ symbol. Line modifications are represented as a combination of line deletion and addition.

Now, when a change occurs to a file, the change can be: a) in only one line, b) in consecutive lines, or c) in lines spread all over the file.

Thus, the receiver of a diff would like to know which line numbers in the original unchanged file were changed. Hence, it is enough if the output of diff includes a special line that indicates the starting line position of the change, as well as the destination line position, followed by the actual changes. The destination line position is included since earlier changes in the same diff could have pushed the original line further down the file.

However, (especially in open-source projects), it is possible that two changes are applied to a file by two separate users at the same line. When integrating these two changes, it is not useful if you only have the line numbers. You also need to provide some context, by which we mean some lines before and after the changed line. This is useful when applying conflicting changes like the one above, as we can use it to determine how the second change should fit in on the first change.

The unified format handles both by providing context around the changed line, and also providing a special line that indicates where in the file the first line of context starts, and how many lines are provided. Note that this line count covers every line shown from that file, changed lines included, not just the unchanged context. To indicate that these lines are special lines that are only for the receiver’s understanding and are not part of the changes themselves, the Unified format surrounds such special lines with ‘@@’ symbols. Such lines are called range information lines. The format of a range information line is:

@@ -<<starting line number of context in original file,number of lines of context from original file>> +<<starting line number of context in modified file,number of lines of context from modified file>> @@

Understanding Employee.java diff

This should now help us understand the output of git diff that we did on Employee.java earlier. Let’s take a look at it again:

The first two lines that you see,

diff --git a/Employee.java b/Employee.java
index b2ea747..cbdaf9e 100644

are generated by Git. Beyond this is the actual diff output. So let's ignore this and move onto the diff.

The first two lines in the diff,

--- a/Employee.java
+++ b/Employee.java

are the two files that diff is trying to compare. Employee.java is prefixed with ‘a/’ and ‘b/’ in the two lines because Git is comparing two versions of the same file: by default, git diff compares the copy in the index (which is identical to the copy in HEAD when nothing has been staged) with your working copy. Git represents these two versions of Employee.java as being in two folders, ‘a/’ and ‘b/’, just as a way of differentiating them. In reality, if you had used just diff, you would have provided two files physically present on the filesystem.

The first range information line is:

@@ -1,6 +1,7 @@

In the range information line, the “-1,6” indicates that the original file’s context provided starts from the first line of the file, and 6 lines of context are provided. The “+1,7” indicates that the new file’s context provided starts from the first line of the file, and 7 lines of context are provided. Why 7? Because of the addition of the grade variable, that is only present in the new file.

The second range information line is:

@@ -12,5 +13,9 @@ class Employee {

In this range information line, the “-12,5” indicates that the original file’s context provided starts from the 12th line of the file, and 5 lines of context are provided. The “+13,9” indicates that the new file’s context provided starts from the 13th line of the file, and 9 lines of context are provided. Why is the starting line position in the new file 13? Because of the addition of the grade variable previously. Why 9 lines of context? Because of the addition of the toString() method in the new context.
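
To make the format concrete, here is a small JavaScript sketch that pulls the numbers out of a range information line. The function name parseRangeLine and the field names are my own; the only thing taken from the unified format itself is the layout of the line, including the convention that a missing line count means 1.

```javascript
// Parse a range information line such as "@@ -12,5 +13,9 @@ class Employee {".
// Returns null if the line is not a range information line.
function parseRangeLine(line) {
  const match = /^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@ ?(.*)$/.exec(line);
  if (match === null) {
    return null;
  }
  return {
    oldStart: Number(match[1]),
    // The count may be omitted, in which case it is 1.
    oldCount: match[2] === undefined ? 1 : Number(match[2]),
    newStart: Number(match[3]),
    newCount: match[4] === undefined ? 1 : Number(match[4]),
    // Git appends some text after the closing @@ as a hint about where
    // the change is located; here that is "class Employee {".
    heading: match[5],
  };
}

const range = parseRangeLine('@@ -12,5 +13,9 @@ class Employee {');
console.log(range.oldStart, range.oldCount); // 12 5
console.log(range.newStart, range.newCount); // 13 9
```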

So what’s a hunk?

Now that you’ve understood the diff output, it becomes easy to understand hunks. A hunk is simply a range information line together with the change information that follows it, up to the next range information line.
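
That definition translates almost directly into code. The sketch below splits the text of a diff into hunks; it assumes a unified diff for a single file, and the name splitIntoHunks and the shape of the returned objects are my own invention.

```javascript
// Split unified diff text into hunks. Each hunk is the range
// information line plus all the lines up to the next range line.
function splitIntoHunks(diffText) {
  const hunks = [];
  let current = null;
  // Drop one trailing newline so it does not turn into an empty line.
  for (const line of diffText.replace(/\n$/, '').split('\n')) {
    if (line.startsWith('@@')) {
      current = { range: line, lines: [] };
      hunks.push(current);
    } else if (current !== null) {
      current.lines.push(line);
    }
    // Lines before the first range line (the "diff --git", "index",
    // "---" and "+++" lines) belong to no hunk and are skipped.
  }
  return hunks;
}
```

For the Employee.java diff, this would return two hunks, one per range information line, which are the same two pieces that git add -p offers for staging.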

Thursday, April 15, 2010

Code

I have been coding professionally for the past 5 years, and coding small projects at home, but only in the last year have I realized the benefits of having a public repository of my code. Having heard a lot about jQuery, Struts 2 and Ruby and having wanted to learn them, I started a small Minesweeper project on the side to learn these technologies. I also hosted the app on Google Code.

The app is hosted here. I won't say I am very great at coding (though I am following various ways to improve) and hence praise and criticism of my code is welcome.

Right now, it uses Struts 2 and jQuery. I have not yet started on the Ruby/RoR part. Hopefully, that day should come soon...

UPDATE (20th Aug 2012): The Java app has been feature complete for a long time (since sometime in 2010 itself). Work continues on the Ruby app along with my other side projects, apart from official work and life.

Friday, October 31, 2008

...but do products have domains?

This is with reference to my previous post, Do code generators have domains?

My IT career has spanned only 4 years, and I have worked for only 2 companies during this period, with the first being a software product company and the second a software services company. Since the work in the first company was developing a code-generator, I never heard the word 'domain' being uttered all through my time there. I heard it first only in the software services co.

Very naturally, I assumed 'domain' was a word that was used only in the software services industry. "Products can never have domains!!", my brain told me. "How could they? You saw it for yourself... company 1 never uttered that word!! Company 2 keeps saying it all the time!!"

Well, that was what I believed for a long time.. but my brain, being ever active ;-), seems to constantly verify whatever it believes against the real world. And very soon, it came up with exceptions!!

Products do have domains. The first example that strikes me when you talk to me about software products would be Microsoft Word. But that's not a very good example in this context, since almost every industry uses Word. I fail to think of one industry that doesn't use Word.

A good example would actually be something like the accounting package, Tally. Tally is a software product that is probably unheard of to people in the retail industry, while accountants using Tally are probably not much aware that there is a software called RayMedi RPOS in the retail industry.

This lack of awareness is because the tools target different domains. They are marketed only to people in the industry they target. Nobody's gonna market RayMedi RPOS to a company that provides accounting services. These products cater to the needs of specific industries (domains) and the people in those industries have probably never heard of those products that fall outside their industry.

Tally and RayMedi RPOS are examples of software products. They also target only certain domains. As such, they are examples of software products that do have domains.

Wednesday, May 21, 2008

You need to think like the user

In software development, it is necessary to think of all the actions that the user would perform with your software. This is so that you can fix any bugs or inadvertent situations that might arise before it goes over to the user.

Software professionals refer to this as 'thinking-like-the-user'. Like Joel Spolsky says in this article, you need to create imaginary users in your brain and think of them exercising your software. This usually reveals scenarios that you might otherwise have never thought of. Personally, I have always striven to think like the user in all situations, whether I am writing a full-fledged application or just changing a block of code.

For the past few months, we have been maintaining some code written many years back. Recently, I found one 'bug' that illustrates what happens when you fail to think like the user. I have not taken screenshots of the application, but the screenshots you see below are similar.

We have a screen where the user can search for groups.

Now the Find Groups button lists all groups if the user does not enter anything in the text box. However, if he does enter something, then clicking the button lists all groups starting with that text.

Usually, I would just press the Enter key and move ahead. On this particular day, I used the mouse to click on the button.

And I got this...

I analyzed the code to see what the developer had done. Here it was:

<script>
function validateIt() {
//If the value entered is of zero length, then show an alert message.
}
</script>

Then a few lines down....

<html:form action="some.do">
<html:submit value = "Find Group" onclick="javascript:validateIt();"/>
</html:form>

So basically the developer had written a script to validate the text entered by the user in the field. This would fire when the user clicked the button. But the control which fires this event is a submit button, which can be activated by pressing the Enter key. However doing so does not fire this script - for that to happen, you would have to write an onsubmit event handler.

So now you have a case where clicking on a button to submit the form is considered erroneous, while pressing the Enter key to do the same thing isn't.
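
For what it's worth, here is a rough sketch of what validating on the submit event could look like. It is not the application's code: isValidQuery, attachValidation and the onInvalid callback are names I made up, and the only rule carried over is the one from the comment in validateIt() (reject a zero-length value).

```javascript
// Returns true when the search text is acceptable. The rule mirrors the
// comment in validateIt(): a zero-length value is rejected.
function isValidQuery(text) {
  return text.length > 0;
}

// Attach the validation to the form's submit event rather than to the
// button's click event, so that it runs however the form is submitted.
// `form` and `field` are assumed to be DOM elements (anything exposing
// addEventListener and value will do); `onInvalid` is called when the
// check fails, for example to show an alert.
function attachValidation(form, field, onInvalid) {
  form.addEventListener('submit', function (event) {
    if (!isValidQuery(field.value)) {
      event.preventDefault(); // stop the form from being sent
      onInvalid();
    }
  });
}
```

With this in place, the mouse and the keyboard take the same path, and the user gets the same behaviour either way.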

This thankfully was an internal application - people probably didn't care that much about quality. The application has more of these annoyances, but the users probably ground their teeth and carried on, since they needed this application for their daily work and had no other choice.

What would have happened if this was part of a product? And in a market where you have competitors?

Sunday, September 30, 2007

Does onsite travel mean only the US?

I am a software developer working in India. (OK, you got that from my blog's heading, but I just thought I'd repeat it). I am now working for my second company. While in the first company, I went on an on-site visit to Delhi, where our client was. I stayed there for a month, helping the client out as he faced problems with our product. It was a kind-of great experience for me, as I got to know for the first time the thought processes people had and problems they faced as they used our product.

Well, a year and a half later, I was job-hunting, and went to various interviews. Now, on-site experience is considered very valuable in the Indian software industry, and I was pretty sure that people would respect me for the experience I gained. In one particular interview, I mentioned that I had gone on-site. The interviewer asked, "Where?" and I said, "Delhi".

He said, "That's not on-site." I said, "Yeah, but that's where our client is..."

The interviewer nodded, but I could see he didn't believe it. He didn't believe in the experience I had gained there. He didn't consider Delhi as on-site.

I joined the very same company whose interviewer asked me that question. With other work, this incident was pushed to the back of my mind. Some days back, it re-surfaced. After lunch, I and a few of my friends working in the same company were walking towards our building, when for some reason, I mentioned the incident. One of my friends immediately hotly defended the interviewer; surely Delhi could never be considered on-site!!

I got angry; I took it kind-of personally - well, he was after all, saying that my on-site experience at Delhi was not to be considered. I got puffed up and ready to argue, but my friend said he had to pick up cash at the ATM and walked away.

Later, when I was at home and in a calm mood, I thought this over, finally. I realized that for some reason, my company (I am not sure about other companies in India, but I think they are also the same) seems to consider only US travel as on-site. I feel this is ridiculous.

Why should I feel so? Let me put forth my reasons. Let's start by answering this question:

Why is on-site experience valued?

Let me provide the answer too: On-site experience is valued because for the first time, you are face-to-face with the customer. While at offshore, you can easily say that this-bug-cannot-be-fixed/I-cannot-come-on-Saturday-to-fix-that-bug and such stuff. But you cannot say that in front of the customer, because if you do, the customer then stares at you in anger. And I tell you - that stare pierces your heart, that stare gives you guilt feelings, that stare gives you cold sweats - your company, rather YOU have just lost a customer. The customer has just taken one step down the road to never recommending you and your company to others.

Lost. That very word makes you sweat. That very word, that very stare, ensures that even after you go home, you keep thinking about it. The customer's face, after you finished speaking, is what comes into your head, and you cannot shake it away, for some reason, which you don't know.

That, in my opinion, is why on-site experience is so valued. You face the customer. Not everybody can do that. And when you return, after having successfully moved your application to production, and after having been given a personal send-off by your all-smiling customer, you return to two things - 1) the knowledge and the satisfaction that you have just retained a customer, and 2) the applause of all your colleagues. Soon, you find that everybody in your company listens to you all the more. It's not that they weren't listening before; it's just that they listen to you all the more.

It is for this lesson that on-site experience is so valued. Now the question is, where can you get this experience? Only in the US? I say, no!! Customers are spread throughout the world, and wherever your customer is, you can gain this experience. He may be in Delhi or in San Francisco, but whatever it is, on-site is valued for customer relationships, not for US travel.

And that's why I expect people to respect me and my experience when I say I travelled on-site and solved my customer's problems!! It might be Delhi, but when my application didn’t work the way the customer wanted it, he raised his voice, and said, “What application is this, yaar?” And that’s it – it sends me into a flurry. I immediately note it down, and when I return, include the feature into the application.

On-site is valued for customer relationships, not for US travel.

What are your views on this? Am I wrong here? Is there something I don't seem to understand? I would love to hear any opposing views, so feel free to comment on this post or mail me regarding this.
