How I Passed the Google Cloud Professional Data Engineer Certification Exam
Without the recommended three years of practical experience
*Note: this article covers the Google Cloud Professional Data Engineer certification exam as it was until March 29, 2019. The exam changed after that date; the changes are described in the "Advanced" section.*
Google sweatshirt: yes. Serious facial expression: yes. Photo from the video version of this article on YouTube.
Want to get a brand new sweatshirt like in my photo?
Or maybe you're interested in the Google Cloud Professional Data Engineer certificate and are trying to figure out how to get it?
Over the past few months, I went through several courses and worked with Google Cloud in parallel to prepare for the Professional Data Engineer exam. Then I went to the exam and passed it. A few weeks later, a sweatshirt arrived - but the certificate came faster.
This article will provide some information you might find helpful and the steps I took to get my Google Cloud Professional Data Engineer certificate.
Translated by Alconost
Why get a Google Cloud Professional Data Engineer certificate?
Data is everywhere. As a result, people who know how to build systems that process and use data are in demand, and Google Cloud provides the infrastructure for building these systems.
If you already have Google Cloud skills, how can you demonstrate them to a future employer or client? There are two ways: a portfolio of projects or a certification.
The certificate tells potential customers and employers that you have certain skills and that you have made efforts to obtain their official confirmation.
This is stated in the official description of the exam:
Demonstrate your ability to design and build data processing systems and machine learning models on the Google Cloud platform.
If you do not yet have the relevant skills, working through the certification training materials will teach you what you need to know to build first-class data processing systems on Google Cloud.
Who needs a Google Cloud Professional Data Engineer certificate?
You have seen the numbers: cloud technology is growing, and it is here to stay. If you have not seen the statistics, take my word for it: the cloud is on the rise.
If you already work in data processing or analysis, are a machine learning engineer, or want to move into the data field, then the Google Cloud Professional Data Engineer certification is what you need.
The ability to use cloud technologies is becoming a mandatory requirement for all professionals working with data.
Do I need a certificate to be a professional in data processing, data analysis or machine learning?
You can use Google Cloud to work with data processing solutions without a certificate.
A certificate is just one way to confirm your existing skills.
How much is it?
The exam costs $200. If you fail it, you will have to pay again to retake it.
In addition, you will have to spend money on preparatory courses and using the platform itself.
The cost of working with the platform is the fee for using Google Cloud services. If you are an active user, you already know this well. If you are a newcomer just starting on the courses described in this article, you can create a Google Cloud account and do everything you need while staying within the $300 credit that Google adds to your account on registration.
We will get to the cost of the courses in a moment.
How long is the certificate valid?
Two years. After this period, the exam must be taken again.
And since Google Cloud is constantly evolving, it is likely that certification requirements will also change (this happened just when I started writing the article).
What do you need to prepare for the exam?
For professional certification, Google recommends having more than three years of industry experience and more than a year in developing and managing solutions using GCP.
I had neither.
I had about six months of experience in each.
To fill the gap, I used several online training resources.
What courses have I taken?
If your situation is similar to mine and you do not meet the recommended requirements, you can take some of the following courses to raise your level.
I used them to prepare for certification. They are listed in the order I took them.
For each one, I have listed the cost, the time required and how useful it was for passing the certification exam.
Some of the great online learning resources I used to improve my skills before the exam. In order: A Cloud Guru, Linux Academy, Coursera.
Cost: $49 per month (after a 7-day free trial).
Time: 1-2 months, more than 10 hours a week.
Usefulness: 8 out of 10.
The Data Engineering on Google Cloud Platform Specialization on Coursera was developed in collaboration with Google Cloud.
It is divided into five sub-courses, each of which takes about 10 hours of study per week.
If you are not familiar with data processing on Google Cloud, this specialization will give you exactly the skills you need. You complete a series of practical exercises using an iterative platform called QwikLabs. The exercises are preceded by lectures from Google Cloud practitioners on how to use various services, such as Google BigQuery, Cloud Dataproc, Dataflow and Bigtable.
Time: 1 week, 4-6 hours.
Usefulness: 4 out of 10.
The low usefulness rating does not mean that the course is useless in general; that is not the case at all. The only reason for the low rating is that the course is not focused on the Professional Data Engineer certification (as its name implies).
I took it as a refresher after completing the Coursera specialization, since I had only used Google Cloud in a few limited cases.
If you’ve previously worked with another cloud service provider or have never used Google Cloud, this course may be useful for you: this is a great introduction to the Google Cloud platform as a whole.
Cost: $49 per month (after a 7-day free trial).
Time: 1-4 weeks, more than 4 hours a week.
Usefulness: 10 out of 10.
Looking back on the courses after passing the exam, I can say that the Linux Academy Google Certified Professional Data Engineer course was the most useful.
The video tutorials, the Data Dossier e-book (an excellent free training resource provided with the course) and the practice exams make this course one of the best I have ever completed.
I even recommended it as a reference in the Slack notes I wrote for my team after the exam.
Notes in Slack
- Some questions on the exam were not covered by the Linux Academy course, A Cloud Guru or the Google Cloud practice exams (which was to be expected).
- One question showed a graph of data points and asked which equation could be used to group them (for example, cos(X) or X² + Y²).
- You must know the differences between Dataflow, Dataproc, Datastore, Bigtable, BigQuery and Pub/Sub and understand how each can be used.
- The two case studies on the exam were the same as the ones in the practice exams, although I did not read them at all during the exam (the questions themselves were enough to answer).
- It’s useful to know the basic syntax of SQL queries, especially for BigQuery questions.
- The practice exams in the Linux Academy and GCP courses are very similar in style to the questions on the exam. Take them several times to find your weak points.
- Keep in mind that Dataproc works with Hadoop, Spark, Hive and Pig.
- Dataflow works with Apache Beam.
- Cloud Spanner is a database built for the cloud from the ground up; it is ACID-compliant and available globally.
- It is useful to know the names of the older relational and non-relational database equivalents (for example, MongoDB, Cassandra).
- IAM roles differ slightly between services, but it helps to understand how to separate users' ability to view data from their ability to design workflows (for example, the Dataflow Worker role lets you design workflows but not see the data).
That is probably enough for now. Every exam will be different. The Linux Academy course will give you about 80% of the knowledge you need.
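The note about grouping data points by an equation can be illustrated with a small sketch. It assumes the points form two concentric rings, which is a guess: the notes do not say what the graph on the exam looked like or which answer was correct. The helper names `make_ring` and `radial_feature` and the threshold value are hypothetical and chosen only for this example. The idea is that points which cannot be split by a straight line in (X, Y) become easy to separate once X² + Y² is used as a feature.

```python
import math
import random


def make_ring(radius, n, noise=0.1, seed=0):
    """Generate n points scattered around a circle of the given radius."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        angle = rng.uniform(0, 2 * math.pi)
        r = radius + rng.uniform(-noise, noise)
        points.append((r * math.cos(angle), r * math.sin(angle)))
    return points


def radial_feature(point):
    """Return X^2 + Y^2, the squared distance from the origin."""
    x, y = point
    return x * x + y * y


# Two hypothetical rings: radius about 1 and radius about 3.
inner = make_ring(1.0, 50, seed=1)
outer = make_ring(3.0, 50, seed=2)

# Any threshold between 1.1^2 and 2.9^2 separates the rings (assumed value).
threshold = 4.0
inner_labels = [radial_feature(p) < threshold for p in inner]
outer_labels = [radial_feature(p) < threshold for p in outer]

print("all inner points below threshold:", all(inner_labels))
print("any outer point below threshold:", any(outer_labels))
```

A periodic feature such as cos(X) would suit data that repeats along one axis instead, so the shape of the plotted points decides which equation fits.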
Time: 1-2 hours.
Usefulness: 5 out of 10.
These videos were recommended on the A Cloud Guru forums. Many of them are not related to the Professional Data Engineer certification, so I only picked the ones whose service names I recognized.
Some services can seem complicated while you are going through a course, so it was nice to see a particular service described in just a minute.
Cost: $49 per certificate or free (without certificate).
Time: 1-2 weeks, more than six hours a week.
Usefulness: not rated.
I found this resource the day before my scheduled exam date. There was not enough time to go through it, hence the missing usefulness rating.
However, judging by the course overview page, this is a great resource for reviewing everything you have learned about Data Engineering on Google Cloud and finding your weak points.
I recommended this course to one of my colleagues who is preparing for certification.
Usefulness: not rated.
Another resource that I came across after the exam. It looks comprehensive, yet the presentation is quite concise. It is also free. You can refer to it between practice exams and even after certification to refresh your knowledge.
What did I do after the course?
As I neared the end of the courses, I booked the exam a week in advance.
Having a deadline is excellent motivation to review what you have learned.
I took the Linux Academy and Google Cloud practice exams several times until I was consistently scoring above 95%.
The first time I passed a Linux Academy practice exam with a score above 90%.
The tests on each platform are similar; I wrote down and worked through the questions I kept getting wrong, which helped me eliminate my weak points.
The exam itself covered designing data processing systems on Google Cloud and was built around two case studies (the content of the exam has changed since March 29, 2019). All of the questions were multiple choice.
The exam took two hours and felt about 20% harder than the practice exams I was used to.
Still, the practice exams are a very valuable resource.
What would I change if I took the exam again?
More practice exams. More practical knowledge.
Of course, you can always prepare even a little better.
The recommended requirements call for more than three years of industry experience and more than a year of working with GCP, which I did not have, so I had to make do with what I had.
The exam was updated on March 29th. The materials in the article will still provide a good basis for preparation, but it is important to note some changes.
Google Cloud Professional Data Engineer Exam Sections (Version 1)
- Design of data processing systems.
- Building and maintaining data structures and databases.
- Data analysis and enabling machine learning.
- Modeling business processes for analysis and optimization.
- Ensuring reliability.
- Data visualization and decision support.
- Design with a focus on security and compliance.
Google Cloud Professional Data Engineer Exam Sections (Version 2)
- Design of data processing systems.
- Building and operationalizing data processing systems.
- Operationalizing machine learning models (most of the changes have occurred here) [NEW].
- Ensuring solution quality.
In version 2, sections 1, 2, 4 and 6 of version 1 are combined into sections 1 and 2, and sections 5 and 7 are combined into section 4. Section 3 in version 2 has been expanded to cover all the new machine learning features in Google Cloud.
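The regrouping described above is easier to check when written down as a small lookup table. This is only a sketch of my reading of the text: the name `V2_FROM_V1` is hypothetical, and the entry mapping version 1 section 3 to version 2 section 3 is an assumption, since the text only says that section 3 was expanded. Section numbers follow the order of the two lists above.

```python
# Version 2 section(s) -> version 1 sections they absorb, as described above.
# The (3,) -> [3] entry is an assumption: the text only says section 3 grew.
V2_FROM_V1 = {
    (1, 2): [1, 2, 4, 6],
    (3,): [3],
    (4,): [5, 7],
}


def describe(mapping):
    """Return one human-readable line per group of sections."""
    lines = []
    for v2, v1 in mapping.items():
        v2_text = ", ".join(str(n) for n in v2)
        v1_text = ", ".join(str(n) for n in v1)
        lines.append(f"v2 section(s) {v2_text} <- v1 section(s) {v1_text}")
    return lines


# Sanity check: every one of the seven version 1 sections appears exactly once.
covered = sorted(n for v1 in V2_FROM_V1.values() for n in v1)
assert covered == list(range(1, 8))

for line in describe(V2_FROM_V1):
    print(line)
```

If you studied with version 1 materials, a table like this shows which of your notes carry over to each version 2 section.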
These changes are recent, so many training materials have not yet been updated.
However, if you use the materials from the article, this should be enough to cover 70% of the necessary knowledge. I would also familiarize myself with the following topics (they appeared in the second version of the exam):
- Google Machine Learning APIs.
- Google Cloud Machine Learning Engine.
- Google Cloud TPUs (hardware developed by Google specifically for machine learning).
- Google's glossary of machine learning terms.
As you can see, the exam update is primarily related to machine learning capabilities in Google Cloud.
Update as of April 29, 2019. I received a message from the instructor of the Linux Academy course (Matthew Ulasien).
Just for reference: we plan to update the Data Engineer course at Linux Academy to reflect the new objectives sometime in mid to late May.
After taking the exam, you only receive a result of "pass" or "fail". For practice exams, the advice is to aim for at least 70%, so I aimed for 90%.
After passing the exam, you will receive an activation code via email along with the official Google Cloud Professional Data Engineer certificate. Congratulations!
The activation code can be used in the exclusive Google Cloud Professional Data Engineer store, where you can pick up some nice merchandise: there are T-shirts, backpacks and sweatshirts (some items may be out of stock at the time of delivery). I chose a sweatshirt.
Having received a certificate, you can demonstrate your skills (officially) and return to the work that you do best - building systems.
See you in two years - on re-certification.
P. S. Many thanks to the wonderful teachers of the above courses and to Max Kelsen for providing resources and time for studying and preparing for the exam.
About the translator
The article was translated by Alconost.