December 21, 2020

How to become a Python developer

In this post I present a study plan for becoming a Python software developer. It consists of these 8 not-so-easy steps:

- Pick a project
- Choose a tech specialization
- Learn Python basics
- Practice programming
- Learn the ecosystem
- Study computer science
- Prepare yourself for the job
- Find a mentor

Why I’m writing this

One of my former colleagues asked me how to become a Python developer. Of course, there are thousands of courses, boot camps and different programs helping people to start a developer career. Read more

September 6, 2020

Advanced fixtures with pytest

Other pytest articles:

- Why testing is important
- Types of tests
- Test driven Development
- Hello, World!
- Selecting tests with pytest
- Testing HTTP client with pytest
- Testing database with pytest
- Advanced fixtures with pytest
- Pytest plugins

Now let’s create another test. It will test the integration between our 2 components that talk to external systems: the API and the database cache. Let’s test that when we query a number twice, we call the API only once, and that the result is saved to the database and fetched from it on the second call. Read more
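A minimal sketch of such a test, under stated assumptions: `check_number`, `DictCache` and the `validate` method on the API object are hypothetical names invented for this illustration, and an in-memory dictionary stands in for the real database. In the course itself the fakes would more likely be provided through pytest fixtures.

```python
from unittest.mock import Mock


class DictCache:
    """Hypothetical in-memory stand-in for the database cache."""

    def __init__(self):
        self._rows = {}

    def get(self, number):
        return self._rows.get(number)

    def put(self, number, existing):
        self._rows[number] = existing


def check_number(number, api, cache):
    """Return the validation result, calling the API only on a cache miss."""
    cached = cache.get(number)
    if cached is not None:
        return cached
    existing = api.validate(number)  # `validate` is an assumed method name
    cache.put(number, existing)
    return existing


def test_api_is_called_once_for_repeated_number():
    api = Mock()
    api.validate.return_value = True
    cache = DictCache()

    first = check_number("+15551234567", api, cache)
    second = check_number("+15551234567", api, cache)

    assert first is True and second is True
    # The second call must have been served from the cache.
    api.validate.assert_called_once_with("+15551234567")
    assert cache.get("+15551234567") is True
```

pytest collects the `test_` function as is; the mock records every call, so the assertion fails if the second lookup reaches the API.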

September 6, 2020

Hello, World!

In this course, we will be working on a mobile phone number validation application. The application:

- Accepts a list of numbers as input
- For every number in the list:
  - Normalizes the number
  - Checks the cache to see if this number was validated before
  - If it’s not in the cache, calls an external service REST API to validate the number
  - Prints the normalized number and the result of validation

Read more
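The flow above could be sketched roughly as follows. This is only an illustration: the `normalize` rule (dropping spaces, dashes and parentheses), the `process` function and the use of a plain dictionary as the cache are all assumptions, and `validate` is a caller-supplied function standing in for the external REST API.

```python
def normalize(number):
    """Hypothetical normalization: drop spaces, dashes and parentheses."""
    return "".join(ch for ch in number if ch not in " -()")


def process(numbers, validate, cache):
    """Validate each number, consulting the cache before calling `validate`."""
    results = []
    for raw in numbers:
        number = normalize(raw)
        if number not in cache:
            # `validate` stands in for the call to the external REST API.
            cache[number] = validate(number)
        print(number, cache[number])
        results.append((number, cache[number]))
    return results
```

Passing the validator and the cache in as arguments keeps the loop easy to test, because both can be replaced with fakes.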

September 6, 2020

Pytest plugins

There are a lot of plugins in the pytest ecosystem. Some of the widely used ones are listed here. All the plugins can be installed with pip and invoked by providing an argument to the pytest executable.

pytest-cov

This plugin calculates test coverage: how much of our code is covered by tests. Read more

September 6, 2020

Selecting tests with pytest

Let’s add another requirement for our normalize function - it will raise an exception if the number contains a letter, or if a plus sign is not at the beginning. Now let’s think a bit about the design of the application. Read more
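One possible shape for that requirement is sketched below. The choice of `ValueError` and the set of separators that get stripped are assumptions; the excerpt only states that an exception is raised for letters and for a misplaced plus sign.

```python
def normalize(number):
    """Strip common separators and reject malformed numbers.

    Raises ValueError (an assumed exception type) if the number contains
    a letter or if a plus sign appears anywhere except the first position.
    """
    cleaned = "".join(ch for ch in number if ch not in " -()")
    if any(ch.isalpha() for ch in cleaned):
        raise ValueError(f"number contains a letter: {number!r}")
    if "+" in cleaned[1:]:
        raise ValueError(f"plus sign is not at the beginning: {number!r}")
    return cleaned


def test_normalize_keeps_leading_plus():
    assert normalize("+1 555-123") == "+1555123"
```

In a pytest suite the error cases would typically be checked with `pytest.raises(ValueError)` around calls such as `normalize("12a4")` and `normalize("1+234")`.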

September 6, 2020

Test driven Development

Is it better to write test cases after the code has been written or beforehand? Usually, it’s cheaper to detect bugs as early as possible in the development process. And writing test cases first will minimize the time between when a defect is inserted into the code and when the defect is detected and removed. Read more

September 6, 2020

Testing database with pytest

We are going to use a database in our number testing application as a cache for API call results - API calls can be costly and we don’t want to check the same number twice against it. Read more
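As a rough sketch of such a cache, the example below uses SQLite from the standard library. The excerpt does not say which database the course uses, so SQLite, the `NumberCache` class and the table layout are all assumptions made for illustration.

```python
import sqlite3


class NumberCache:
    """Cache of validation results, backed by SQLite (an assumed choice)."""

    def __init__(self, connection):
        self._conn = connection
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS numbers "
            "(number TEXT PRIMARY KEY, existing INTEGER NOT NULL)"
        )

    def get(self, number):
        """Return True/False for a cached number, or None on a cache miss."""
        row = self._conn.execute(
            "SELECT existing FROM numbers WHERE number = ?", (number,)
        ).fetchone()
        return None if row is None else bool(row[0])

    def put(self, number, existing):
        """Store or overwrite the validation result for a number."""
        self._conn.execute(
            "INSERT OR REPLACE INTO numbers (number, existing) VALUES (?, ?)",
            (number, int(existing)),
        )
        self._conn.commit()


def test_cache_round_trip():
    # An in-memory database keeps the test isolated and fast.
    cache = NumberCache(sqlite3.connect(":memory:"))
    assert cache.get("+15551234567") is None
    cache.put("+15551234567", True)
    assert cache.get("+15551234567") is True
```

Taking the connection as a constructor argument lets a test supply a throwaway in-memory database instead of the real one.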

September 6, 2020

Testing HTTP client with pytest

Now let’s move to checking if the number exists or not. For that, we are going to employ a third-party API. According to the API docs:

- It’s a REST API
- We need to use HTTP GET
- We provide a number in query parameters
- The result is JSON: {'existing': True | False}

Read more
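A client for an API of that shape might look like the sketch below. The endpoint URL and the `number` query parameter name are hypothetical, since the excerpt does not name them, and the standard library's `urllib` is used here only to keep the example dependency-free.

```python
import io
import json
from unittest.mock import Mock
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://api.example.com/validate"  # hypothetical endpoint


def number_exists(number, opener=urlopen):
    """Ask the (hypothetical) validation API whether the number exists."""
    query = urlencode({"number": number})  # parameter name is an assumption
    with opener(f"{API_URL}?{query}") as response:
        payload = json.load(response)
    return bool(payload["existing"])


def test_number_exists_sends_number_in_query():
    # BytesIO mimics the response object: it can be read and used in `with`.
    fake_opener = Mock(return_value=io.BytesIO(b'{"existing": true}'))
    assert number_exists("+15551234567", opener=fake_opener) is True
    fake_opener.assert_called_once_with(
        "https://api.example.com/validate?number=%2B15551234567"
    )
```

Because the opener is injectable, the test never touches the network and can still check the exact URL that would be requested.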

September 6, 2020

Types of tests

There are many types of tests. Brian Marick came up with this chart, which is widely used to show which types you should care about in order to deliver a high-quality application. In this diagram, he categorized tests according to whether they are business-facing or technology-facing, and whether they support the development process or are used to critique the project. Read more

September 6, 2020

Why testing is important

Testing makes our code flexible, reliable and reusable.

Flexible

If you don’t have tests for your code - every change to the code is a possible bug. Thus, developers fear making changes and implementing new features, no matter how flexible the architecture of the application is. Read more

August 23, 2020

Producer consumer pattern for fast data pipelines

Let’s build a simple data pipeline. It will read a SOURCE table in a MySQL database. This table has only one column, VALUES, and contains 1000 rows: numbers from 1 to 1000. The application will calculate the square root of each number and put the result in the DESTINATION table. This table will have 2 columns:

- VALUE column holding the original number
- ROOT column for the result of the calculation

A very simple implementation of this pipeline can look like this: Read more
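The post's own implementation is not part of this excerpt. As a rough sketch of the producer-consumer idea named in the title, the example below uses in-memory sequences in place of the MySQL tables; the function names, the sentinel value, the queue size and the number of consumers are all assumptions.

```python
import math
import queue
import threading

SENTINEL = None  # tells a consumer that no more work is coming


def producer(source_rows, work_queue, consumer_count):
    """Read values from the source and hand them to the consumers."""
    for value in source_rows:
        work_queue.put(value)
    # One sentinel per consumer so that every thread gets a stop signal.
    for _ in range(consumer_count):
        work_queue.put(SENTINEL)


def consumer(work_queue, destination_rows, lock):
    """Take values off the queue and store (value, root) pairs."""
    while True:
        value = work_queue.get()
        if value is SENTINEL:
            break
        row = (value, math.sqrt(value))
        with lock:
            destination_rows.append(row)


def run_pipeline(source_rows, consumer_count=4):
    """Run one producer and several consumers; return rows sorted by value."""
    work_queue = queue.Queue(maxsize=100)  # bounded, so the producer cannot run far ahead
    destination_rows = []
    lock = threading.Lock()
    consumers = [
        threading.Thread(target=consumer, args=(work_queue, destination_rows, lock))
        for _ in range(consumer_count)
    ]
    for thread in consumers:
        thread.start()
    producer(source_rows, work_queue, consumer_count)
    for thread in consumers:
        thread.join()
    return sorted(destination_rows)
```

Calling `run_pipeline(range(1, 1001))` mirrors the 1000-row SOURCE table. In a real pipeline the consumers would write to the DESTINATION table instead of appending to a list.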

August 13, 2020

Speeding up Python code with Cython

Cython is an extension of Python that adds static typing to variables, functions and classes. It combines the simplicity of Python and the efficiency of C. You can rewrite your code in Cython and compile it to C to achieve higher execution speed. In this tutorial, you’ll learn how to:

- Install Cython and compile Cython code
- something about speed
- Write a Cython application with statically typed variables and C functions

What Cython is and what it’s used for

The Cython project consists of two parts: a programming language and a compiler. Read more

November 4, 2019

Python linters for better code quality

Code quality

There are two types of software quality: external and internal. External quality characteristics are the ones that are important to the users of the system. They may include:

- correctness - software behaves as users expect
- usability - how easy it is to use
- reliability - the ability to function under any circumstances

Internal quality characteristics are what developers care about:

- maintainability - how easily the software can be modified
- readability - how easily new developers can understand what the code is doing by reading it
- testability - how easily the system can be tested to verify that it satisfies the requirements

The internal characteristics relate closely to the quality of the code and design. Read more

September 15, 2019

Representing money in Python

Python’s float type is a natural first step to represent monetary amounts in the code. Almost all platforms map Python floats to IEEE-754 “double precision”. Doubles contain 53 bits of precision. When the machine is trying to represent the fractional part (mantissa) of a given number, it finds a bit sequence \(b_1, b_2, \ldots, b_{53}\) so that the sum $$ b_1\left(\frac{1}{2}\right)^{1} + b_2\left(\frac{1}{2}\right)^{2} + \ldots + b_{53}\left(\frac{1}{2}\right)^{53} $$ is as close to the number as possible. Read more
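A short illustration of why this matters for money: amounts such as 0.1 have no exact binary expansion, so the stored value is only the nearest sum of powers of one half. The `decimal` comparison at the end shows one commonly used alternative from the standard library; it is included as an assumption about where the argument leads, not as the post's stated recommendation.

```python
from decimal import Decimal

# 0.1 and 0.2 are stored as the nearest binary fractions, so the sum drifts.
float_total = 0.1 + 0.2
print(float_total == 0.3)  # False

# Converting the float to Decimal reveals the exact value that is stored.
stored_value = Decimal(0.1)
print(stored_value == Decimal("0.1"))  # False

# Decimals built from strings keep the amounts exact.
decimal_total = Decimal("0.10") + Decimal("0.20")
print(decimal_total == Decimal("0.30"))  # True
```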

August 18, 2019

CI/CD pipeline for AWS Lambda (Python runtime)

Continuous integration and continuous delivery are powerful practices that allow teams to release software faster and at a higher quality. This post walks through the steps to implement a CI/CD pipeline for a small Lambda function that calculates square roots by:

- getting a message from SQS that contains the number to calculate the square root for
- checking if the calculation was done before by querying DynamoDB
- if there is no cached answer in DynamoDB, calculating the square root and saving the result
- printing the result so it’s visible in CloudWatch logs

Things I’d like the pipeline to do: Read more

April 16, 2018

Extracting keyphrases from texts: unsupervised algorithm TopicRank

Keyphrase extraction is the task of identifying single or multi-word expressions that represent the main topics of a document. There are 2 approaches to extracting topics (and/or keyphrases) from a text: supervised and unsupervised.

Supervised approach

This is a multi-label, multi-class classification task, where the following features can be used as input:

- text converted to bag-of-words
- text treated as a stream of vectors, which are pre-trained word embeddings

For bag-of-words, a linear SVM is a good classifier. Read more

April 8, 2018

E-commerce recommendation systems: basket analysis.

Once a novelty, recommendation systems are now used by more and more e-commerce sites to help customers find products to purchase. For e-commerce business owners these tools facilitate cross-sales.

Usage

Amazon is one of the most prominent organizations that have used recommendations to increase sales. According to fortune.com, Amazon was able to increase sales by 29% in 2012 as a result of implementing a recommendation system. 35% of Amazon’s revenue is generated by its recommendation engine (source). Read more

© Alexey Smirnov 2021