If you are reading this article, you are clearly interested in text summarization. You may also want to go a step further and build a text summarization app, but you don’t know how because you have no front-end experience. Say hello to Streamlit. With Streamlit, you can build beautiful apps in hours without any knowledge of front-end technologies.
Trust me when I say you can actually build beautiful apps in 30 minutes using Streamlit, as the title of this article says. It only takes 10–15 minutes to figure out the features you need in your web app by referring to the official documentation. In the next 15–20 minutes, you can put together the Streamlit and summarization Python code, and your web app is ready in about 30 minutes. Congratulations! …
In the previous article, we went through the features of the Python int data type that you may be less familiar with. Let’s go through the float data type in this article.
Any real-valued number, such as 0.1, 0.2, 1.234, or 56e-4, is represented by the float data type.
The basic arithmetic operations on floats, such as addition, subtraction, multiplication, division, and floored division, yield a result of type float.
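As a quick check of the statement above, here is a minimal sketch that applies each operation to float operands and inspects the result type. The variable names and sample values are only illustrative.

```python
# Two float operands taken from the examples in the text.
a = 0.1
b = 0.2

# Each basic arithmetic operation applied to float operands.
results = {
    "addition": a + b,
    "subtraction": a - b,
    "multiplication": a * b,
    "division": a / b,
    "floored division": 7.5 // 2,  # floors the quotient but stays a float
}

# Print each result together with the name of its type.
for name, value in results.items():
    print(f"{name}: {value!r} ({type(value).__name__})")
```

Note that floored division rounds the quotient down, yet the result is still a float when either operand is a float.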
You might be wondering why I am writing about numeric types when there seems to be little to explain. Most of us are familiar with basic arithmetic operations such as addition, subtraction, multiplication, and division. But Python actually offers more that we need to understand. So, the goal of this article is to present some interesting features of the Python int data type.
Any number that can be written without a fractional part is an integer. For example, 0, 1, 100, -1001, and 4578 are integers.
We all know that addition (+), subtraction (-), and multiplication (*) of integers give an integer result. …
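The point about integer results can be verified with a small sketch. The operand values are borrowed from the examples above, and the variable names are only illustrative.

```python
# Two integer operands taken from the examples in the text.
x = 100
y = -1001

total = x + y        # addition
difference = x - y   # subtraction
product = x * y      # multiplication

# Print each result together with the name of its type.
for label, value in [("sum", total), ("difference", difference), ("product", product)]:
    print(f"{label}: {value} ({type(value).__name__})")
```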
Python is an interpreted, high-level, general-purpose programming language. It has been steadily gaining popularity over the years, making it one of the most popular programming languages.
There are hundreds of reasons why people love Python. The most common seem to be readability, simplicity, ease of use, a vast and rapidly growing community, and third-party libraries (such as pandas, NumPy, scikit-learn, etc.). You can use it for system programming, GUIs, and numerical and scientific programming, as well as for natural language analysis, visualization, image processing, and machine learning. Sigh! The list is endless.
In this article, we will go through one of the memory management techniques in Python called Reference Counting.
In Python, all objects and data structures are stored in the private heap and are managed by Python Memory Manager internally. The goal of the memory manager is to ensure that enough space is available in the private heap for memory allocation. This is done by deallocating the objects that are not currently being referenced (used). As developers, we don’t have to worry about it as it will be handled automatically by Python.
Reference counting is one of the memory management techniques in which objects are deallocated when there is no reference to them in a program. …
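To make the idea concrete, here is a minimal sketch that assumes the CPython interpreter. `Node` is a hypothetical class used only for illustration. The absolute numbers reported by `sys.getrefcount` vary between Python versions, so the sketch only looks at how the count changes.

```python
import sys
import weakref


class Node:
    """Hypothetical class; its instances can be weakly referenced."""


def reference_count_delta():
    """Return how the count changes when an alias is added, then removed."""
    obj = Node()
    before = sys.getrefcount(obj)
    alias = obj                      # a second name now refers to the object
    after = sys.getrefcount(obj)
    del alias                        # drop the extra reference again
    final = sys.getrefcount(obj)
    return after - before, final - before


def is_freed_after_last_reference():
    """Check that the object is gone once its only strong reference is deleted."""
    obj = Node()
    probe = weakref.ref(obj)         # a weak reference does not keep obj alive
    del obj                          # the count drops to zero, so CPython frees it
    return probe() is None


print(reference_count_delta())
print(is_freed_after_last_reference())
```

The weak reference is used only as a probe: it lets us observe that the object has been deallocated without keeping it alive ourselves.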
Pkg is Julia’s built-in package and environment manager. You can treat Pkg as Python’s equivalent of Pip & Conda (we use Pip as a package manager and Conda as both package & environment manager in Python). Julia and Pkg both have their own REPL. Julia REPL can be used for testing the code quickly and also for managing packages & environments while Pkg REPL is used only for managing packages and environments.
What is REPL?
As per Wikipedia “A read–eval–print loop (REPL), also termed an interactive top-level or language shell, is a simple interactive computer programming environment that takes single user inputs, executes them, and returns the result to the user; a program written in a REPL environment is executed piecewise.” …
Julia is a high-level, high-performance dynamic language. Julia has been gaining popularity in recent times as it can also be used for Data Science & Machine Learning. We will go through Julia in detail in a separate article but in this one let’s see how to set up Julia on Jupyter notebook.
I assume that you are familiar with Anaconda and Jupyter Notebook and have installed both on your machine. If you are new to Anaconda, install it from the official page here. If you have already installed it, continue with step 2.
Model drift is one of the important concepts in the Machine Learning Life Cycle, but it is often the most neglected. We don’t see as much information about it on the internet as we see for other topics in Machine Learning.
Imagine that you have built a machine learning predictive model and successfully shipped it to production. You assume that the model in production performs as well as your validation score suggests, and you are getting good results so far. But after some time, you notice that the performance of the model starts to deteriorate. In the worst case, you might have to stop using it until the issue is fixed. …
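One simple way to notice this kind of deterioration is to compare the accuracy of recent production batches against the validation score. The sketch below is only an illustration under stated assumptions: the function names, the batch data, and the tolerance are all hypothetical, and it assumes that true labels eventually become available for production predictions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for truth, pred in zip(y_true, y_pred) if truth == pred)
    return correct / len(y_true)


def drifted_batches(validation_score, batches, tolerance=0.05):
    """Return the indices of batches scoring below validation_score - tolerance."""
    flagged = []
    for index, (y_true, y_pred) in enumerate(batches):
        if accuracy(y_true, y_pred) < validation_score - tolerance:
            flagged.append(index)
    return flagged


# Hypothetical production batches, ordered by time: (true labels, predictions).
batches = [
    ([1, 0, 1, 1], [1, 0, 1, 1]),  # all four predictions correct
    ([1, 0, 1, 1], [1, 0, 1, 0]),  # three of four correct
    ([1, 0, 1, 1], [0, 1, 1, 0]),  # one of four correct
]

flagged = drifted_batches(validation_score=0.9, batches=batches, tolerance=0.1)
print(flagged)
```

In practice the metric, the batch size, and the tolerance would depend on the problem; this sketch only shows the basic comparison.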
As stated on the GitHub page: “SHAP (SHapley Additive exPlanations) is a game-theoretic approach to explain the output of any machine learning model. It connects optimal credit allocation with local explanations using the classic Shapley values from game theory and their related extensions”.
In the previous article here, we talked about TreeExplainer (which uses the Tree SHAP algorithm) for the Boston House Price prediction dataset. In this blog, let us understand KernelExplainer (Kernel SHAP algorithm) for model interpretation.
Kernel SHAP is a model-agnostic method for estimating SHAP values for any machine learning model. As you can guess from the heading, the Kernel SHAP algorithm is based on two components: a local surrogate model (LIME) and Shapley values. It is implemented in the KernelExplainer class in the SHAP library. …
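To build intuition for the Shapley values that Kernel SHAP estimates, the sketch below computes them exactly by brute force for a tiny model. This is not Kernel SHAP itself and does not use the SHAP library: Kernel SHAP approximates these values by sampling coalitions and fitting a weighted linear model, whereas this sketch enumerates every coalition. The function `toy_model`, the input, and the all-zero baseline are hypothetical choices for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input."""
    n = len(x)

    def value(subset):
        # Features in the coalition come from x, the rest from the baseline.
        mixed = [x[j] if j in subset else baseline[j] for j in range(n)]
        return predict(mixed)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                # Marginal contribution of feature i to this coalition.
                phi += weight * (value(coalition | {i}) - value(coalition))
        phis.append(phi)
    return phis


def toy_model(features):
    """Hypothetical model: one additive term plus one interaction term."""
    return 2.0 * features[0] + 3.0 * features[1] * features[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phis = shapley_values(toy_model, x, baseline)
print(phis)
```

The values add up to the difference between the prediction at `x` and the prediction at the baseline, and the interaction term is split evenly between the two features involved. The cost grows exponentially with the number of features, which is why approximations such as Kernel SHAP are used in practice.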
As stated by the authors: The Python Record Linkage Toolkit is a library to link records in or between data sources. The toolkit provides most of the tools needed for record linkage and deduplication. The package contains indexing methods, functions to compare records and classifiers.
Deduplication is the process of eliminating or removing the redundant data from the given data.
Record linkage is the process where the data from one source is joined with data from another source that describes the same entity. …
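The two definitions above can be illustrated with a small standard-library sketch. It deliberately does not use the Python Record Linkage Toolkit's API, and it matches records only on an exact, normalized key, whereas real record linkage usually needs fuzzy comparison. The field names, helper functions, and sample records are all hypothetical.

```python
def normalize(name):
    """Lowercase a name and collapse repeated whitespace."""
    return " ".join(name.lower().split())


def record_key(record):
    """Key used to decide whether two records describe the same entity."""
    return (normalize(record["name"]), record["birth_year"])


def deduplicate(records):
    """Keep the first record seen for each key (deduplication)."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def link(left, right):
    """Pair records from two sources that share a key (record linkage)."""
    index = {}
    for record in right:
        index.setdefault(record_key(record), []).append(record)
    return [(record, match) for record in left for match in index.get(record_key(record), [])]


customers = [
    {"name": "Ada Lovelace", "birth_year": 1815},
    {"name": "ada  lovelace", "birth_year": 1815},  # duplicate after normalization
    {"name": "Alan Turing", "birth_year": 1912},
]
orders = [{"name": "ALAN TURING", "birth_year": 1912, "order_id": 7}]

unique_customers = deduplicate(customers)
links = link(unique_customers, orders)
print(len(unique_customers), links)
```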