There have been several polarizing topics throughout history, such as religion & political affiliation. And for software developers there's one more biggie ... tabs -vs- spaces! Which group is right? Perhaps the opinion of the better programmers should have more weight(?) Is there a metric we can use to determine whether one group of programmers is better than the other? Assuming better programmers are paid more, perhaps we can make the comparison based on salary ...
Stack Overflow recently published the results of their annual Developer Survey, and they wrote an interesting article that seems to indicate that coders who use spaces make more money than those who use tabs. Their main graph seemed simple and straightforward, and I kinda liked it at first glance.
But upon closer examination, I started to question their graph. For example, the title indicates that it was based on 12,426 of the survey respondents - and while it is true that 12,426 (of the 51,392 total) respondents claimed to be professional developers, and indicated whether they used tabs or spaces, and provided an annual salary number ... I'm wondering how the graph handled the 53 who did not specify the number of years they coded for a living. (Are they somehow included in the graph, or does the graph not really represent 12,426 respondents?) And since this was an international survey, many of the respondents are not paid in US dollars - did they convert all the other currencies to US dollars (and what exchange rates did they use)? Is it statistically valid to combine programmers from countries with vastly different pay scales (for example, India and China probably pay their programmers a lot less than the US and Europe do)? And the 859 who left the 'currency' field blank - can they just assume those programmers were specifying their salary in US dollars?
So I decided to create my own version of the graph, and hopefully not leave so many questions hanging in the air. In my version, I only plotted the salaries where the programmer's currency was US dollars. Also, the data provided enough granularity that I could also split out a 20+ years category. I labeled each line (rather than using a color legend), and simplified the graph a bit.
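The filtering and grouping described above could be sketched roughly as follows. This is only an illustration in Python, not the code behind the graph: the field names (`currency`, `tabs_spaces`, `years`, `salary`), the 5-year bucket boundaries, and the sample rows are all made up for the example and are not the survey's actual schema or data.

```python
from collections import defaultdict
from statistics import median

# Made-up rows for illustration only (NOT real survey data).
# Field names and values are hypothetical, not the survey's actual schema.
ROWS = [
    {"currency": "USD", "tabs_spaces": "Spaces", "years": 3, "salary": 70000},
    {"currency": "USD", "tabs_spaces": "Tabs", "years": 4, "salary": 60000},
    {"currency": "USD", "tabs_spaces": "Spaces", "years": 22, "salary": 120000},
    {"currency": "USD", "tabs_spaces": "Tabs", "years": 25, "salary": 110000},
    {"currency": "EUR", "tabs_spaces": "Spaces", "years": 8, "salary": 50000},
    {"currency": "", "tabs_spaces": "Tabs", "years": 12, "salary": 90000},
    {"currency": "USD", "tabs_spaces": "Both", "years": None, "salary": 80000},
]


def years_bucket(years):
    """Map years of professional coding to a bucket, with a separate 20+ group."""
    if years is None:
        return None  # respondent left the field blank
    if years >= 20:
        return "20+"
    low = (years // 5) * 5
    return f"{low}-{low + 4}"


def median_salary_by_group(rows, currency="USD"):
    """Median salary per (years bucket, tabs/spaces), for one currency only."""
    groups = defaultdict(list)
    for row in rows:
        if row["currency"] != currency:
            continue  # drops other currencies AND blank currency fields
        bucket = years_bucket(row["years"])
        if bucket is None:
            continue  # drops respondents with no years value
        groups[(bucket, row["tabs_spaces"])].append(row["salary"])
    return {key: median(values) for key, values in groups.items()}


result = median_salary_by_group(ROWS)
```

Dropping the blank-currency and blank-years rows explicitly (rather than guessing what they meant) is the design choice that avoids the open questions raised about the original graph.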
Their article also had graphs by country. ... Or is it by currency? Or is it by a combination of country & currency (you could live in a certain country, and get paid in a different currency, right?) I think these individual bar charts are better than the article's combined line chart, but there was still some room for improvement (see if you can spot things you would like to change, before you scroll down to my version!)
And of course, this wouldn't be much of a Graph Guy blog, if I didn't try to improve their graphs, eh?!?... Here is a list of some of my changes:
- Why not put 'Both' between the Spaces & Tabs bars, rather than on the end?
- Their bars go higher than the last axis tick mark, so you can't really tell how high the values go (especially for the India and Other graphs).
- Their title says the graphs represent 12,426 respondents ... but I would like to know how many respondents were represented in each individual graph.
- Bar charts are simple ... but perhaps too simple in this case. I'd really like to see how the individual data points were spread out.
I decided to use a box plot rather than a bar chart. I created a separate one for each currency available in the data. Below are a few of my plots - you can see the plots for each currency by scrolling down through the following page.
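For readers who want to see what a box plot summarizes, here is a minimal Python sketch that computes the respondent count and a five-number summary per currency and tabs/spaces group. It is not how the plots on this page were made; the field names (`currency`, `tabs_spaces`, `salary`) are hypothetical, and the "inclusive" quartile method is just one of several common conventions.

```python
from collections import defaultdict
from statistics import quantiles


def box_stats(values):
    """Count plus five-number summary (min, Q1, median, Q3, max) for one box."""
    ordered = sorted(values)
    q1, med, q3 = quantiles(ordered, n=4, method="inclusive")
    return {
        "n": len(ordered),
        "min": ordered[0],
        "q1": q1,
        "median": med,
        "q3": q3,
        "max": ordered[-1],
    }


def box_stats_by_group(rows):
    """One summary per (currency, tabs/spaces) pair.

    Groups with fewer than two salaries are skipped, since quartiles
    are not meaningful for them.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[(row["currency"], row["tabs_spaces"])].append(row["salary"])
    return {key: box_stats(vals) for key, vals in groups.items() if len(vals) >= 2}
```

Including `n` in each summary addresses the point in the list above about knowing how many respondents each individual graph represents.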
Based on the results of this particular survey, it appears that the professional developers who use spaces are indeed paid more than those who use tabs. Any tab-sters out there thinking about switching now?
And you might be asking yourself "What does The Graph Guy use?!?" ... spaces, of course! :)
Spaces rule! :) And this fun article too!
So tired of messed up code depending on the editor.
Having said that, most programmers don't even bother aligning anything.
Great article, and presented in a fun way - something I think every presentation should do. I don't know much about the Developer Survey, but one thing that I think is missing is the job type: permanent or freelance/contracting. Freelancers tend to earn more than permanent staff regardless of the number of years. So, perhaps, this should have been taken into consideration.
For me, when editing, it is tabs all the way. Tabs offer me fewer keystrokes and a much faster and more reliable code layout; this is especially useful when I want to use my style of indentation on nested Do/End loops.
Even if you only use tabs as a surrogate way of entering four space-key characters, think of those keystrokes that you are saving, a 75% saving is pretty good.
I'm happy for people to use the [Insert spaces for tabs] option in the SAS Enterprise Guide editor options or the [Substitute spaces for tabs] preference in SAS Studio. That way we are all happy: you get your spaces but you don't have to clatter the keys so much.
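The 75% figure mentioned above is simply one Tab keypress replacing four Space keypresses. A tiny Python sketch of that arithmetic (the function name is made up for illustration):

```python
def keystroke_saving(indent_width):
    """Fraction of keystrokes saved by one Tab press instead of indent_width Space presses."""
    return (indent_width - 1) / indent_width


saving_for_four = keystroke_saving(4)  # (4 - 1) / 4
```

The saving depends on the indent width: with a two-space indent it would be half, not three quarters.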
In response to Kevin's post, I have never come across a problem where a tab character was used within SAS code whether in batch or interactive mode. I have only seen general difficulties when data is supplied in a text file where the columns are tab-separated but the programmer believes that the columns are space-separated or vice versa.
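One way to avoid guessing whether a text file is tab-separated or space-separated is to check before reading it. The following is a rough Python heuristic, not a SAS feature and not a robust parser; the function name and the "same number of tabs on every line" rule are assumptions made for this sketch.

```python
def guess_delimiter(text):
    """Guess whether columns are tab- or space-separated.

    Returns "tab" if every non-blank line has the same nonzero number of
    tab characters, "mixed" if tabs appear inconsistently, else "space".
    """
    lines = [line for line in text.splitlines() if line.strip()]
    tab_counts = {line.count("\t") for line in lines}
    if lines and len(tab_counts) == 1 and 0 not in tab_counts:
        return "tab"
    if any("\t" in line for line in lines):
        return "mixed"
    return "space"
```

A "mixed" result is the case worth investigating by hand, since that is where a file looks aligned in one editor and breaks in another.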
This helped me solve our problem: https://communities.sas.com/t5/Base-SAS-Programming/datalines-embedded-tabs-interactive-vs-batch/td-p/138280
I would have to revisit the problem to remember what caused it for us, but I think it was parsing the path names. I use the option in the Enhanced Editor to convert tabs to spaces. If I need to incorporate a tab into a string, then I use "09"x and concatenation.
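For readers outside SAS, the same two ideas (converting tabs in source text to spaces, and adding a tab to a string explicitly by its hex code instead of typing a literal tab) look roughly like this in Python. This is an analogue for illustration only, not SAS code, and the helper name and the 4-column tab width are assumptions.

```python
# Build the tab character from its hex code 09, rather than typing a literal tab.
TAB = bytes.fromhex("09").decode("ascii")

# Concatenate it into a string where a real tab is required.
header = "name" + TAB + "value"


def tabs_to_spaces(source, width=4):
    """Replace tabs with spaces, aligning to tab stops every `width` columns."""
    return source.expandtabs(width)


converted = tabs_to_spaces("\tx = 1\n\t\ty = 2")
```

Spelling the tab out by code makes it visible in the source, so it survives an editor setting that silently converts typed tabs to spaces.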
This may not be an issue with EG. The programmers in this shop probably did not batch submit before, but when we had to run 300+ programs, I wrote a macro (not a .bat file).
Tabs versus spaces is not just a formatting issue. Tabs can cause issues if one batch submits a program in SAS on a Windows machine that might not arise if one submits that same program interactively.
I wonder if it's at least partly an age thing - I learnt programming hand-writing COBOL on coding sheets thirty years ago (yes I am that old...) where your code had to be aligned to fit certain areas of the line. It seems to me that it's easier to preserve that accuracy and consistency if you use tabs, which as far as I recall I've always done ever since.
Hey Robert, great article! I like your improvements to these graphs, and I think the changes increase the integrity of the data and take out some potential confounds. However, I would be interested to investigate whether there is some third variable at play. For example, maybe programmers who use spaces are more flexible with how they write and adapt their code, and perhaps this causes employers and supervisors to pay them more because they like the programmer's willingness to adjust the code to their preferences. Or perhaps programmers who use spaces are more meticulous with their code because they have to be careful about using the correct amount of spacing, and as a result their code is usually cleaner, making it easier to use and read and thus causing these programmers to be paid more. I don't have any real basis or data to back up these speculations, and it is not that I think these are accurate explanations; I am simply pointing out the possibility of a third variable and my interest as to what that might be. As someone with a background in research, I am, like you were with the original data and graphs, skeptical of data and correlation relationships.
Either way, it was a nice improvement on the data and an entertaining read. Thanks!
Perhaps the question we need to dig into is "why do people use spaces or tabs?" Is it in their training? Is it their mindset/mentality? Is it something forced by the language/IDE they're using (older languages -vs- newer languages)? Is it from their background (engineering -vs- liberal arts)?