October 11, 2023

DynamoDB - pros and cons

Introduction: Choosing the right database for your application is a critical decision in software development. DynamoDB, the NoSQL database from Amazon Web Services, is often a top consideration. In this blog post, I’ll discuss the pros and cons of using DynamoDB for your app, covering its history, roots, use cases, alternatives, advantages, and challenges. By the end, you’ll have a clear understanding of whether DynamoDB is the right choice for your project. Read more

January 12, 2021

Rotoscoping with OpenCV/C++

One of the most basic problems in VFX is separating an object in a video from its background. The object is then composited back into a new environment or scene. This task goes by different names, such as matting, keying or, most popularly, rotoscoping. In this demo you can see how a church in the foreground is separated from the blue sky in the background, which is then replaced with a sky full of stars. Read more

December 21, 2020

How to become a Python developer

In this post I present a study plan for becoming a Python software developer. It consists of these 8 not-so-easy steps: pick a project; choose a tech specialization; learn Python basics; practice programming; learn the ecosystem; study computer science; prepare yourself for the job; find a mentor. Why I’m writing this: one of my former colleagues asked me how to become a Python developer. Of course, there are thousands of courses, boot camps and other programs that help people start a developer career. Read more

October 14, 2020

Chroma Keying with OpenCV/C++

Chroma keying - or blue/green screen matting - is the process of removing a specific color from a video so that it can be replaced with another picture or video. Historically, green or blue was used as the background color because these colors are not dominant in human skin or clothes. However, when a weather forecaster puts on a green skirt it can lead to funny situations. Chroma keying has become a very popular technique not only on TV but also in the movies. Read more

September 14, 2020

Uploading files to AWS S3 with Flask

One way to upload files using Flask is to literally create a route that accepts an HTTP POST and saves the received bytes to disk. With horizontal scaling, however, you need to mount external storage that supports replication on every running instance. Another option is to use object storage - like AWS S3 - and upload files directly from the frontend. In that case Flask only needs a route that generates a URL for the frontend to upload to. Read more

September 6, 2020

Advanced fixtures with pytest

Other pytest articles: Why testing is important | Types of tests | Test driven Development | Hello, World! | Selecting tests with pytest | Testing HTTP client with pytest | Testing database with pytest | Advanced fixtures with pytest | Pytest plugins. Now let’s create another test - it will test the integration between our 2 components that talk to external systems - the API and the database cache. Let’s test that when we query a number twice, we call the API only once, and that the result is saved to the database and fetched from it on the second call. Read more
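The "query twice, call the API once" check described in this excerpt can be sketched roughly as follows. This is not the code from the article: `NumberChecker`, its `validate` API method and the dict standing in for the database cache are all hypothetical names chosen for illustration, and `unittest.mock.Mock` replaces the real API.

```python
from unittest.mock import Mock


class NumberChecker:
    """Hypothetical component: validates numbers via an API and caches results."""

    def __init__(self, api, cache):
        self.api = api      # assumed to expose a validate(number) method
        self.cache = cache  # dict-like stand-in for the database cache

    def check(self, number):
        if number in self.cache:            # repeated query: served from the cache
            return self.cache[number]
        result = self.api.validate(number)  # first query: ask the API
        self.cache[number] = result         # remember the answer
        return result


def test_api_called_once_for_repeated_number():
    api = Mock()
    api.validate.return_value = True
    cache = {}
    checker = NumberChecker(api, cache)

    # Query the same number twice.
    assert checker.check("+15551234567") is True
    assert checker.check("+15551234567") is True

    # The API was hit only once, and the result landed in the cache.
    api.validate.assert_called_once_with("+15551234567")
    assert cache == {"+15551234567": True}
```

Because the test function only uses plain `assert` statements, pytest can collect it as is; with a real database, the dict would presumably be replaced by a fixture.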

September 6, 2020

Hello, World!

In this course, we will be working on a mobile phone number validation application. The application: accepts a number as input; then, for every number in the list, normalizes the number; checks the cache to see whether this number was validated before; if it’s not in the cache, calls an external service’s REST API to validate the number; prints the normalized number and the result of validation. Let’s start with the Normalize step. Read more

September 6, 2020

Pytest plugins

There are a lot of plugins in the pytest ecosystem. Some of the most widely used are listed here. All the plugins can be installed with pip and invoked by providing an argument to the pytest executable. pytest-cov: This plugin calculates test coverage - how much of our code is covered by tests. Read more

September 6, 2020

Selecting tests with pytest

Let’s add another requirement for our normalize function - it will raise an exception if the number contains a letter, or if a plus sign is not at the beginning. Now let’s think a bit about the design of the application. Read more
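A minimal sketch of the requirement above, under stated assumptions: the excerpt does not say what normalization itself does, so stripping spaces, dashes, dots and parentheses is a guess made for illustration, and `ValueError` is an assumed choice of exception.

```python
SEPARATORS = " -()."  # assumption: characters that normalization strips


def normalize(number):
    """Return the number without separators, or raise ValueError if invalid."""
    # Requirement 1: letters are not allowed anywhere in the number.
    if any(ch.isalpha() for ch in number):
        raise ValueError("number must not contain letters")
    cleaned = "".join(ch for ch in number if ch not in SEPARATORS)
    # Requirement 2: a plus sign may only appear as the first character.
    if "+" in cleaned[1:]:
        raise ValueError("plus sign is only allowed at the beginning")
    return cleaned
```

For example, `normalize("+1 (555) 123-4567")` returns `"+15551234567"`, while `normalize("555-CALL")` and `normalize("1+555")` both raise `ValueError`.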

September 6, 2020

Test driven Development

Is it better to write test cases after the code has been written or beforehand? Usually, it’s cheaper to detect bugs as early as possible in the development process. And writing test cases first will minimize the time between when a defect is inserted into the code and when the defect is detected and removed. Read more

September 6, 2020

Testing database with pytest

We are going to use a database in our number testing application as a cache for API call results - API calls can be costly and we don’t want to check the same number twice against it. Read more

September 6, 2020

Testing HTTP client with pytest

Now let’s move to checking if the number exists or not. For that, we are going to employ a 3rd party API. According to the API docs: it’s a REST API; we need to use HTTP GET; we provide a number in the query parameters; the result is a JSON {'existing': True | False}. I’m going to create this 3rd party API myself and run it from my local environment so we can see the access logs. Read more
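A client for an API shaped like the one described could look roughly like this. It is only a sketch: the query parameter name `number` and the base URL are assumptions, not taken from the API docs mentioned above, and the `opener` argument exists so that a test can swap in a fake instead of making a real HTTP call.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def number_exists(base_url, number, opener=urlopen):
    """GET base_url?number=<number> and return the 'existing' flag."""
    # urlencode escapes characters such as '+' so they survive the query string.
    url = f"{base_url}?{urlencode({'number': number})}"
    with opener(url) as response:
        payload = json.load(response)  # expected shape: {"existing": true | false}
    return bool(payload["existing"])
```

In a test, `opener` can be replaced with a function that records the requested URL and returns an in-memory response such as `io.BytesIO(b'{"existing": true}')`.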

September 6, 2020

Types of tests

There are many types of tests. Brian Marick came up with this chart, which is widely used to show which types you should care about in order to deliver a high-quality application. In this diagram, he categorized tests according to whether they are business-facing or technology-facing, and whether they support the development process or are used to critique the project. Read more

September 6, 2020

Why testing is important

Testing makes our code flexible, reliable and reusable. Flexible: If you don’t have tests for your code, every change to the code is a possible bug. Thus, developers fear making changes and implementing new features, no matter how flexible the architecture of the application is. Read more

August 23, 2020

Producer consumer pattern for fast data pipelines

Let’s build a simple data pipeline. It will read a SOURCE table in a MySQL database. This table has only one column, VALUES, and contains 1000 rows - the numbers from 1 to 1000. The application will calculate the square root of each number and put the result in a DESTINATION table. This table will have 2 columns: a VALUE column holding the original number and a ROOT column for the result of the calculation. A very simple implementation of this pipeline could look like this: Read more
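As a rough illustration of the pipeline described above (not the implementation from the article), the sketch below swaps MySQL for an in-memory SQLite database so that it runs without a server. The table and column names follow the excerpt; `run_pipeline` is a hypothetical name. Because VALUES is an SQL keyword, the column name is quoted.

```python
import math
import sqlite3


def run_pipeline(conn):
    """Read every number from SOURCE and store its square root in DESTINATION."""
    rows = conn.execute('SELECT "VALUES" FROM SOURCE').fetchall()
    for (value,) in rows:
        conn.execute(
            'INSERT INTO DESTINATION ("VALUE", ROOT) VALUES (?, ?)',
            (value, math.sqrt(value)),
        )
    conn.commit()


# Set up the two tables (SQLite stands in for MySQL here).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE SOURCE ("VALUES" INTEGER)')
conn.execute('CREATE TABLE DESTINATION ("VALUE" INTEGER, ROOT REAL)')
conn.executemany(
    'INSERT INTO SOURCE ("VALUES") VALUES (?)',
    [(n,) for n in range(1, 1001)],  # the numbers 1..1000
)

run_pipeline(conn)
```

This version processes the rows one at a time in a single thread, which is the kind of baseline a producer-consumer design would then try to speed up.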

August 13, 2020

Speeding up Python code with Cython

Cython is an extension of Python that adds static typing to variables, functions and classes. It combines the simplicity of Python with the efficiency of C. You can rewrite your code in Cython and compile it to C to achieve higher execution speed. In this tutorial, you’ll learn how to: install Cython and compile Cython code; something about speed; write a Cython application with statically typed variables and C functions. What Cython is and what it’s used for: The Cython project consists of two parts - a programming language and a compiler. Read more

July 9, 2020

Accepting payments in Flask with Stripe

Introduction: In this article you’ll learn how to use Stripe Checkout to accept one-time payments in a Flask application. The example will be a webshop that has a single page for selling $5 T-shirts. Main page: Create a Flask route that serves the webshop page. The page loads some JavaScript as well: Stripe JS; jQuery for the AJAX call; some custom JavaScript. @app.route('/') def webshop(): return """<html> <head></head> <body> <a href="#" id="checkout">Buy T-shirt for 5$</a> <script src="https://code. Read more

May 7, 2020

How to increase Flask performance

When a Flask app runs slowly, we need to identify the bottleneck. It can be an overloaded database, an unresponsive external API, or heavy, CPU-intensive computation. This is the whole recipe for speeding up Flask: find the source of the sluggish performance. Once the bottleneck is identified, you can tackle the underlying cause. Here I assume that the underlying platform that runs Flask has enough resources to do so. Read more

April 12, 2020

5 ways to deploy Flask

In this post, I’m going to explore 5 ways to deploy a Flask application. In all examples I’m going to use a simple app from the Flask docs:

app.py

    from flask import Flask

    app = Flask(__name__)

    @app.route('/')
    def hello_world():
        return 'Hello, World!'

    if __name__ == '__main__':
        app.run()

Local machine: This option is used when you need to test your application on a local machine. By simply running app. Read more

November 4, 2019

Python linters for better code quality

Code quality: There are two types of software quality - external and internal. External characteristics are the ones that are important to the users of the system. They may include: correctness - the software behaves as users expect; usability - how easy it is to use; reliability - the ability to function under any circumstances. Internal quality characteristics are what developers care about: maintainability - how easily the software can be modified; readability - how easily new developers can understand what the code is doing by reading it; testability - how easily the system can be tested to verify that it satisfies the requirements. The internal characteristics relate closely to the quality of the code and design. Read more

October 23, 2019

Run Flask on AWS ECS (Fargate)

There is an alternative to running Flask on AWS Elastic Beanstalk that allows numerous customization options - running Flask on ECS Fargate. This serverless solution (you don’t have to manage a cluster of EC2 instances) runs Docker images and can run a Flask web server. There are a lot of AWS resources involved in making it work. I’m sharing CloudFormation templates that will create them automatically. Source code. Here are the details of these templates: Read more

October 1, 2019

Static website on AWS S3 with SSL and continuous delivery

AWS S3 is perfect for hosting static websites. A basic setup, where you have a CNAME DNS record pointing to the bucket endpoint, covers a lot of use cases. A couple of things are missing: SSL and continuous delivery. For SSL you need CloudFront to serve as a global load balancer and provide SSL offload. To achieve continuous delivery, connect the GitHub repo storing the source to CodePipeline. CodePipeline is triggered on every push to the master branch and automatically updates the content of the S3 bucket with the changed source files. Read more

September 15, 2019

Representing money in Python

Python’s float type is a natural first step to represent monetary amounts in code. Almost all platforms map Python floats to IEEE-754 “double precision”. Doubles contain 53 bits of precision. When the machine is trying to represent the fractional part (mantissa) of a given number it finds a bit sequence \(b_1, b_2, \ldots, b_{53}\) such that the sum $$ b_1\left(\frac{1}{2}\right)^{1} + b_2\left(\frac{1}{2}\right)^{2} + \ldots + b_{53}\left(\frac{1}{2}\right)^{53} $$ is as close to the number as possible. Read more
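The practical effect of that finite binary sum can be seen in a few lines. This is only an illustration: the standard-library `decimal` module is used here both to reveal the value a double really stores and as one common alternative for exact base-10 amounts, which may or may not be the approach the post goes on to recommend.

```python
from decimal import Decimal

# Binary floats cannot store most decimal fractions exactly.
float_total = 0.1 + 0.2
print(float_total == 0.3)  # False

# Decimal(0.1) reveals the exact value the double actually holds.
stored = Decimal(0.1)
print(stored)

# Building Decimals from strings keeps amounts exact in base 10.
decimal_total = Decimal("0.10") + Decimal("0.20")
print(decimal_total == Decimal("0.30"))  # True
```

The middle `print` shows a long decimal expansion slightly above 0.1, which is the closest value the 53-bit sum can reach.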

August 18, 2019

CI/CD pipeline for AWS Lambda (Python runtime)

Continuous integration and continuous delivery are powerful practices that allow you to release software faster and at a higher quality. This post walks through the steps to implement a CI/CD pipeline for a small Lambda function that calculates square roots by: getting a message from SQS that contains the number to calculate the sqrt for; checking whether the calculation was done before by querying DynamoDB; if there is no cached answer in DynamoDB, calculating the sqrt and saving the result; printing the result so it’s visible in CloudWatch logs. Things I’d like the pipeline to do: Read more

December 12, 2018

Streaming timeseries with Flask and Plotly

This post describes a simple app for streaming CPU utilization to a web page. It uses Flask as the websockets server (flask-socketio plugin), socket.io as the client library and plotly.js for visualization. Flask app: Follow the flask-socketio docs to create a Flask app. SocketIO is going to use Redis as the message broker, as there will be a separate process that pushes messages to clients. The Flask websocket server and this process will communicate through Redis. Read more

November 27, 2018

Background jobs with Flask

The basic request lifecycle with Flask goes like this: Flask gets a request, parses the input parameters, does the necessary calculations, and finally returns the result. This synchronous flow is fine when a user needs the result of the calculation immediately. Another use case is when the result is not needed right away and the user just wants to schedule the task for asynchronous execution. Such scenarios include: sending an email; creating thumbnails from uploaded images; starting a long, CPU-intensive calculation. Common implementation: Asynchronous tasks are usually implemented like this: Read more

October 26, 2018

Multitenancy with Flask

What is multi-tenancy: Consider a SaaS platform that provides access to multiple client organizations. These organizations - tenants - may each have their own database for safety and data protection reasons. These can be databases on a single RDBMS server or on physically different servers. Usually an additional central database (i.e., General) stores metadata and the list of available tenants. Flask-SQLAlchemy: Flask-SQLAlchemy provides an interface to only one database. The Flask app configuration defines SQLALCHEMY_DATABASE_URI with the connection information for it. Read more

September 27, 2018

Flask pagination macro

In this post you’ll find out how to create pagination with the Jinja macro feature. Requirements: show a preconfigured, limited number of pages at once; collapse invisible pages under ...; provide previous/next navigation buttons. Jinja templates for Bootstrap4: I’ve created a 3-tier structure of Jinja templates to use Bootstrap4. The first - bootstrap4_base.html - loads css and js files from a CDN and defines the major blocks: head - holds the content of the <head> tag and defines title, metas, styles; body - holds the content of the <body> tag and defines navbar, content, scripts; navbar - for the navigation bar; content - for the bootstrap container (tag with class="container"); scripts - goes at the end of the body, here is why. Blocks may be extended and/or overwritten in later templates. This template follows the Bootstrap4 intro guide. Read more
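The page-selection logic behind requirements like these can be sketched in plain Python before it is translated into a Jinja macro. This is a hypothetical helper, not the macro from the post: `visible_pages`, the `window` parameter and the choice to always show the first and last page are assumptions made for illustration.

```python
def visible_pages(current, total, window=2):
    """Return the page numbers to render; None marks a collapsed '...' gap."""
    pages = []
    last = 0  # last page number that was emitted
    for page in range(1, total + 1):
        at_edge = page == 1 or page == total          # assumption: edges always shown
        near_current = abs(page - current) <= window  # pages around the current one
        if at_edge or near_current:
            if page - last > 1:
                pages.append(None)  # one or more pages were skipped: collapse them
            pages.append(page)
            last = page
    return pages


def prev_next(current, total):
    """Return (previous, next) page numbers, or None where no such page exists."""
    previous = current - 1 if current > 1 else None
    following = current + 1 if current < total else None
    return previous, following
```

For example, `visible_pages(5, 10, window=1)` returns `[1, None, 4, 5, 6, None, 10]`, and `prev_next(1, 10)` returns `(None, 2)`.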

August 13, 2018

Running Flask in production with Docker

Google’s top results for running Flask with Docker are full of posts where Flask runs in debug mode. This is what the logs look like when Flask is in development mode:

     * Serving Flask app "app" (lazy loading)
     * Environment: production
       WARNING: Do not use the development server in a production environment.
       Use a production WSGI server instead.
     * Debug mode: on
     * Running on http://0.0.0.0:5555/ (Press CTRL+C to quit)

I’d like to make a tutorial on how to run it with uwsgi in Docker using common Docker images. Read more

July 11, 2018

Securing Flask web applications

In this post I’d like to investigate the security mechanisms available in Flask. I’ll go through different types of possible vulnerabilities and the ways they can be mitigated. XSS: Cross-Site Scripting (XSS) attacks are a type of injection, in which malicious scripts are injected into otherwise benign and trusted websites. (source) Exploit: Consider a form asking for user input.

    <form method="post" action="/">
      <input type="text" name="tweet"><br>
      <input type="submit">
    </form>

And a template to show tweets by other users, where the user input from the above form is passed unprocessed: Read more
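To illustrate why passing user input through unprocessed is dangerous, the sketch below contrasts raw interpolation with escaped output. It uses the standard-library `html.escape` as a stand-in and is not necessarily the mitigation the post describes; `render_tweet_unsafe` and `render_tweet` are hypothetical helpers.

```python
from html import escape


def render_tweet_unsafe(tweet):
    """Interpolates user input as is: any markup in it becomes live HTML."""
    return f"<p>{tweet}</p>"


def render_tweet(tweet):
    """Escapes <, >, & and quotes so the input is displayed as plain text."""
    return f"<p>{escape(tweet)}</p>"


payload = "<script>alert(1)</script>"
print(render_tweet_unsafe(payload))  # <p><script>alert(1)</script></p>
print(render_tweet(payload))         # <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

In the first output the browser would execute the script; in the second it would only display the text.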

May 9, 2018

Using NLTK library with AWS Lambda

This is a walkthrough of the process of creating a simple serverless app for finding the part-of-speech tags of an input text.

1 Create virtual environment: In order to separate system-wide dependencies from this app, create a separate virtual environment with:

    ~ mkvirtualenv nltk_env

2 Install nltk: In the virtual environment use pip to install the nltk package:

    (nltk_env) ~ pip install nltk

3 Download nltk data: Pip doesn’t install the additional files that the app needs, but nltk has helper functions to download them: Read more

April 16, 2018

Extracting keyphrases from texts: unsupervised algorithm TopicRank

Keyphrase extraction is the task of identifying single or multi-word expressions that represent the main topics of a document. There are 2 approaches to extracting topics (and/or keyphrases) from a text: supervised and unsupervised. Supervised approach: This is a multi-label, multi-class classification algorithm, where the following features can be used as input: text converted to bag-of-words; text treated as a stream of vectors, which are pre-trained word embeddings. For bag-of-words, a linear SVM is a good classifier. Read more
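As a small illustration of the first feature type, a bag-of-words representation simply counts how often each token occurs and discards word order. The tokenizer below (lowercasing plus a simple regular expression) is an assumption chosen to keep the sketch short, and `bag_of_words` is a hypothetical helper rather than code from the post.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Map each lowercase token to the number of times it occurs in the text."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tokens)


bag = bag_of_words("The cat sat on the mat.")
print(bag["the"])  # 2
print(bag["cat"])  # 1
```

A classifier such as a linear SVM would then receive these counts as a fixed-length vector over the whole vocabulary.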

April 8, 2018

E-commerce recommendation systems: basket analysis.

Once a novelty, recommendation systems are now used by more and more e-commerce sites to help customers find products to purchase. For e-commerce business owners these tools facilitate cross-sales. Usage: Amazon is one of the most prominent organizations that have used recommendations to increase sales. According to fortune.com, Amazon was able to increase sales by 29% in 2012 as a result of implementing a recommendation system. 35% of Amazon’s revenue is generated by its recommendation engine (source). Read more

© Alexey Smirnov 2023