Scope

This article distils the lessons learnt from over a hundred Data Science intern applications, the majority of which did not lead to an interview. The focus is on what and how to communicate rather than what technical skills to acquire.

Contents

  • Introduction
  • Importance of Reading the Intern Advert
  • Cover Letter
  • Curriculum Vitae / Resume
  • Personal Development
  • Summary
Data Science Guides | Image by Author

Introduction

This guide contains advice on the application process, i.e. submitting an application, cover letter and/or a Curriculum Vitae (CV) / Resume for Data Science internships. It is based on over 100 applications and distils the lessons learnt from the more than 92% of applicants who did not get an interview and the 97% who did not get the position.

Although there are some articles around on how to apply for Data Science internships with Tech companies or start-ups, they don’t address some of the specifics of more “traditional” businesses, where Data Science positions are opening up in Human Resources (HR), Finance and even dedicated departments within Manufacturing companies.

Most of the applicants made it very difficult to assess their capabilities and their suitability, because they failed to communicate effectively or did not demonstrate their commitment to self-development. The latter matters given the low barrier to entry for Data Science.

We also need to consider that there are some cultural and country specific differences. For example, in the UK we don’t traditionally have Résumés but instead CVs (Curriculum Vitae), which is Latin for “course of life”; the differences are addressed in this site:

There are also different expectations around length (1, 2 or more pages) and the need and purpose of a cover letter.

These tips are based on submitting a CV and writing a Cover Letter, which will be the same content required in typical application forms and therefore applicable in most instances.

Read, Read and Read!

… the placement/internship details! A significant amount of effort goes into writing a placement advert, including the type of technical skills, personal skills and any expected experience. The application process is typically an electronic form, a CV or a combination of a Cover Letter and CV. Rather than guess if there are page limits or restrictions, most recruiters would be happy to respond to any questions — so don’t be afraid to ask them directly.

Cover Letter

Photo by Kate Macate on Unsplash

What is a Cover Letter?

The main purpose of a Cover Letter is to state why you are suitable for the role. It should make it easy for the recruiter to understand the key elements of your application and how you address the requirements stipulated in the placement advert. It should also include the following details:

  • Your name and contact details (address, phone and email)
  • The date of correspondence
  • The position you’re applying for
  • Personalised content (see below)

The primary reason for including these details is that the person who receives the application (e.g. HR) and manages the administration is unlikely to be the person that assesses it. Therefore we need to make the application as robust as possible to human error as it works its way through an organisation.

Your contact details are required as it’s easy to misplace the letter relative to your CV once it hits the company shared drive, and naturally you want the recruiter to contact you. Similarly, the employer may have multiple positions to fill, so stating the position helps them categorise and sort your application.

Letter Structure & Format

Although the content of the letter is important, the format is an easy way to impress a prospective employer: it demonstrates those advanced “Office Skills” you refer to in your CV (😏) as well as your professional communication skills. Remember that for most companies, Data Scientists are expected to communicate effectively with the rest of the business, and they’re unlikely to speak Slack (or your chat-based communication of choice). In fact, those who fail to format their documents appropriately but claim to be advanced “Office Software” users are likely to suffer a credibility hit.

That’s a lot to communicate from formatting a letter. However, 80–90% of the applicants failed to make things easy for the prospective employer because they did not format their letter appropriately, so it’s an easy way for your application to stand out.

The following example is UK specific, but shows the main elements:

Cover Letter Structure and Formatting | Image by the Author

The image above shows 11 key elements that make up a Cover Letter:

  1. The left hand side should begin with the name of the recruiter if known and the full address of the organisation.
  2. The right hand side should contain your full name, address and preferred methods of contact (e.g. phone and/or email address); the latter if you email across your application. Note that the text is left justified — this can be achieved in any word processing software using tables with invisible borders (or python-docx 😃).
  3. The date of the letter should be one line below either address (yours or the organisation’s). These three items are very much country specific, so it is worthwhile finding out what the norm is where you’re applying.
  4. The opening salutation or greeting depends on whether you know the recruiter’s name or not. “Dear Sir/Madam” is a little old-fashioned but accepted and, depending on the industry, “Dear Recruitment Team” is sometimes accepted too. Note that Item 9 is dependent on what you state here.
  5. The subject line should be a combination of bold and/or underlined and functions the same as the subject line of an email. In this instance, it should clearly state that you are applying for a Data Science intern position.
  6. The opening line is a single sentence that succinctly captures the intent of the letter. For most interns applying from University, something like “I’m currently on a 4-year undergraduate Masters in Mechanical Engineering degree from the University of Rotherham…”.
  7. The content of the main body of text is addressed in the next section.
  8. The closing line is a single sentence that underlines the purpose of the letter, which in this case is an application for an internship that you’d like the recruiter to consider.
  9. To close the letter, typically “Yours faithfully” is used if the letter isn’t addressed to a specific name or “Yours sincerely” if it is. Again, this is country and industry specific.
  10. For emailed submissions or electronic applications, it’s not required to add a scanned signature but some people do.
  11. As the Cover Letter is accompanying a CV or Resume, it’s common practice to remind the reader that there are enclosed items (the equivalent of attachments) by placing “Enc.” in the footer.

These items are applicable for most roles, not just those in Data Science. Each item is an opportunity to communicate more than just the content (which is important): it lets you display office software skills, show that you understand what’s expected in a professional setting and convey a degree of respect by following an established protocol.
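Since python-docx has already had a mention, the pairing between items 4 and 9 can be written down as a tiny rule. This is only a sketch of the UK convention described above; the function name `closing_for` and the set of generic greetings are my own illustrative assumptions, and other countries or industries may expect something different.

```python
# Hypothetical helper: pairs the opening salutation (item 4) with the
# closing (item 9) using the UK convention described in the list above.
GENERIC_GREETINGS = {"dear sir/madam", "dear recruitment team"}  # assumed list


def closing_for(salutation):
    """Return the closing that matches the opening salutation."""
    normalised = salutation.strip().rstrip(",").lower()
    if normalised in GENERIC_GREETINGS:
        # Letter is not addressed to a named person
        return "Yours faithfully"
    # Letter is addressed to a specific name
    return "Yours sincerely"


if __name__ == "__main__":
    for greeting in ("Dear Sir/Madam,", "Dear Dr Smith,"):
        print(greeting, "->", closing_for(greeting))
```

If you generate letters from a template, a check like this helps avoid the classic mismatch of a named greeting with “Yours faithfully”.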

Cover Letter Content

If the internship or placement advertisement is the exam question, then the Cover Letter is your answer. It’s a challenge on how to be concise, yet convey the most positive aspects of your application. The content (like the CV) should be tuned to each prospective employer and role. Certainly, for most UK applications, a single page (of A4 or US Letter size) should be sufficient. If you choose to write more, then you need to ensure the added length has a disproportionate impact on your application. I would certainly recommend sticking to a single page.

The CV and Cover Letter should be considered in tandem to minimise the level of duplicated content. The CV will stipulate certain facts but is constrained by its structure. Some of those facts have to be repeated in the Cover Letter but, in general, the letter gives you more freedom to make your case.

There is no generic answer to the content you need to provide; you have to look at the advertisement and try and address the specifics. Broadly, it should convey three things:

  1. Your enthusiasm for the role
  2. Your experience, skills and other measures of suitability for the role
  3. Your key people orientated skills

Naturally, you should highlight the positive elements of your application (“sell yourself”). However, that does not mean you should make things up or be dishonest. There is nothing more damning for a recruiter than to invite an individual to interview only to discover that much of the application was exaggerated. For example, if you don’t work well in a team, then point out that you are self-motivated and can work effectively alone; this highlights the positive aspect of your application.

There is also a growing tendency to cite online Data Science profiles, either through git repositories (e.g. GitHub) or something like Kaggle. This can have both a positive and negative impact. If you choose to share your profile, then make sure there is some recent content. I’ve been to a number of GitHub profiles where there was no Data Science content, no easy-to-follow README.md, or nothing had been updated in over six months. These can have a negative impact on your application.

Likewise, a well-maintained, easy-to-follow repository or profile, with code and good comments, can have a very good impact on your application. The mentality with the Cover Letter, Online Profiles and the CV is the same: you have to make every cm/px/inch of your application count!
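As a quick self-check before sharing a profile, the “six months” rule of thumb mentioned above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name `is_stale` and the 183-day threshold are mine, and the timestamp is a hypothetical value you would copy by hand from the “last updated” date shown on your own repository.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "six months" approximated as 183 days.
STALE_AFTER = timedelta(days=183)


def is_stale(last_updated_iso, now=None):
    """Return True if the last update is more than ~six months before `now`."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Accept the trailing "Z" used by many ISO 8601 timestamps.
    last_updated = datetime.fromisoformat(last_updated_iso.replace("Z", "+00:00"))
    return now - last_updated > STALE_AFTER


if __name__ == "__main__":
    # Hypothetical timestamp, for illustration only.
    print(is_stale("2021-01-15T10:00:00Z"))
```

The threshold itself is not important; the point is to look at your profile the way a recruiter would and to refresh it, or leave it off the application, if it looks abandoned.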

The Basics

It’s hard to overstate the number of applications that are submitted without having been spell-checked, proofread or otherwise checked prior to submission. It can be deeply corrosive if someone says they’re good at paying attention to detail and then makes simple spelling or grammatical errors.

The same applies with “wild” font choices or very small font sizes; most educational institutions normally set a basic standard for report or essay writing that can be equally applied here or alternatively use Google for sensible choices.

Cover Letter Feedback

I began this article by identifying the lessons learnt from a large number of applicants who did not get an interview (>92%) based on their applications. Overall those applicants were not able to communicate why their candidacy was worth inviting them to interview.

Very few of the letters were formatted appropriately and therefore easy to read. The vast majority replicated or duplicated content from their CV into the Cover Letter without putting it into context for the role. Put another way, the letter was not specific to this application. Some candidates were very heavy with flattery (“your the best company in the world”), which is off-putting; some struggled to highlight any positive elements of their application; and some unfortunately misunderstood what the business did.

The successful candidates (on the whole) followed the tips outlined above.

Curriculum Vitae / Resume

Photo by Annie Spratt on Unsplash

What is a Curriculum Vitae?

The term Curriculum Vitae, which means “Course of Life” in Latin, has different meanings depending on geographic locations. In the US it has a specific structure which is different to that of a Resume. In the UK, the aims and content are very similar between a CV and Resume outside of academic roles and applications. So if you’re looking online at the differences bear that in mind.

For our purposes the CV contains some facts about an individual such as name, contact details, brief education history, work experience, interests, hobbies and referees (although now often provided on request). Although the structure is set, the level of content can and should vary per application.

The following tips are based on many unsuccessful applications for internships and have been summarised as a series of Do’s and Don’ts.

Provide a Complete Timeline

A number of candidates omitted certain periods from their CVs, either because they were doing jobs they didn’t believe were relevant or sometimes to disguise full time employment. The latter is due to a number of candidates applying for internships having already completed their academic studies.

You might think a gap makes you look mysterious, but a recruiter could easily imagine that you were in prison! A better approach is to provide a full timeline but scale the content to match the role.

Scaling CV Contents for a Given Application | Image by the Author

The figure above shows that a complete timeline is provided but the contents are tuned or scaled for the role or application being made. It should be possible to highlight or extract something from every role e.g. something about Health and Safety or experience dealing with finances or speaking to different customers. It doesn’t all have to be about Python or R.

Avoid Skill Ratings

Avoid Self-Ratings | Image by the Author

Skill pills are a popular trend for displaying programming knowledge, but self-ratings such as Beginner / Intermediate / Advanced don’t add much value. Despite over 10 years’ worth of statistical plotting experience, and having used seaborn for nearly half that time, I wouldn’t consider myself to be an “Advanced” user of the library. Such ratings mean different things to different people. It’s better to save the space and use it more effectively elsewhere.

Avoid Incomplete Courses

Image by Jan Vašek from Pixabay

Another trend is to advertise what additional courses are currently in progress, usually in Python, R, Data Science and/or Machine Learning. On the one hand, it’s good to see such initiative. On the other hand, there is also a trend where the courses aren’t completed, or where candidates don’t cite what they’ve learned from the course or how they’ve applied it. This often unravels in an interview; remember that everything you state in your CV can form the basis of a question and so needs to be substantiated, otherwise it will affect your credibility.

Provide Evidence of Actual Learning

Completing a course on Machine Learning and understanding Machine Learning are not the same and this is true of all courses. Sadly many students at University are still unaware of this fact.

The better test is when the candidate applies their knowledge in an area not covered by the original course. This could be using an alternate data set from Kaggle like The Pokemon Dataset or applying the same techniques in an area they are interested in. This shows initiative, interest and enthusiasm.

If all the examples you can cite were part of the University or Education Institution course, then you’re unlikely to be competitive against some of the other candidates.

Demonstrate Computing Skills

Image by planet_fox from Pixabay

A number of candidates didn’t inspire a lot of confidence in their Data Science skills because it became apparent they lacked basic computer literacy. This can be addressed by understanding some of the related technologies such as basic networking, web technologies and so forth. It could be as simple as using a Virtual Machine or installing a technology stack. In contrast, candidates that owned and managed Internet of Things (IoT) type devices such as Raspberry Pis demonstrated high credibility because they understood what it took to collect, store and transfer the data.

A number of candidates couldn’t be sure they could install a Data Science environment on their own laptop 🙈. This does not inspire a lot of confidence in their skills overall.
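If you’re unsure where you stand, a short script can tell you whether the usual libraries are actually importable on your own laptop. The following is a minimal sketch using only the Python standard library; the function name `missing_packages` and the package list are illustrative assumptions, so swap in whatever the advert asks for.

```python
import importlib.util

# Assumption: an illustrative list of import names; tune it to the role.
REQUIRED = ["numpy", "pandas", "sklearn", "matplotlib", "seaborn"]


def missing_packages(names):
    """Return the import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All listed packages can be imported.")
```

Being able to say in an interview that you set up, broke and fixed your own environment is a small thing, but it speaks directly to the computer literacy point above.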

Only Share Online Profiles If It Adds Value

Photo by Pankaj Patel on Unsplash

It’s common practice to share GitHub profiles, by which I mean public repositories, as almost a status symbol. Many are created as part of University courses and therefore shared on CVs. As stated previously, if you choose to share an online profile then it must add value by highlighting your skills, showing good practices and/or extracurricular activities.

If the profile is empty, out of date, broken, disorganised or showcases bad habits then it will harm your application.

Personal Productivity

Photo by sporlab on Unsplash

Many of the CVs listed hobbies and interests but did not stipulate how the candidate maintains personal productivity or looks after their mental health. A mark of a professional is that they manage their time by taking regular breaks to maintain their personal productivity and/or engage in exercise (e.g. walking, running or sports) to maintain their health and mental welfare. Proudly declaring one’s ability to sit in front of a computer for 10 hours continuously indicates a lack of maturity.

With the pandemic still affecting many companies worldwide, most of the CVs did not mention how the candidates coped with remote learning. These aspects can easily be intertwined when discussing University courses or previous employment.

CV Feedback

Having focused on the lessons learnt from candidates who were not invited for interview, what were the positive aspects?

The vast majority had most (but not all) of the points listed above. The best applications showed examples of self development and applying their learning on personal projects. These were easy to read, understand and relate to.

Personal Development

Other disciplines such as Aeronautical or Mechanical Engineering require access to specialist knowledge and facilities which can only be obtained in specialist companies or Universities. The barrier to entry in these fields is high, i.e. spending lots of money to replicate them at home is not possible.

Data Science, on the other hand, can be close to free or, at the least, low cost. For example, for Aerodynamics one needs access to specialist text books, and for the latest discoveries a paid subscription to a Scientific Journal service can cost in the order of £2000–4000 per year. In contrast, many of the key text books on Data Manipulation and Machine Learning are free as Jupyter Notebooks. There are plenty of resources online that cover Python and the R language. The majority (if not all) of the key papers on Machine Learning are also available for free download via arXiv.

Unlike other fields, the majority of tools used for Data Science are free and largely open source as well. Given the barrier to entry is low, the expectations are that candidates will spend some of their own time supplementing their University courses with their own learning. This desire for self-development and the application of what has been learnt is probably the most important characteristic sought after from an intern in Data Science.

Summary

The lessons learnt from over 100 intern applications in the field of Data Science have been distilled down into their core elements. Although not everyone will use a combination of Cover Letter and CV/Resume, the key principles have been highlighted, including that even how someone formats a letter can convey useful information about a candidate.

If you have any further tips or even contrasting experiences then please share in the comments.

Attribution

All gists, notebooks and terminal casts are by the author. All of the artwork is based on assets that are explicitly CC0, Public Domain or SIL OFL licensed and is therefore non-infringing. The theme is inspired by and based on my favourite vim theme: Gruvbox.