Pandas: Efficient Use of Pyconomist's Favorite Tool

aka: Data analysis in Python - come in, don't get lost

"I think the problem, to be quite honest with you, is that you've never actually known what the question is", says Deep Thought in The Hitchhiker's Guide to the Galaxy. The Python ecosystem for data analysis has flourished in the last years, but as a consequence, choosing the right tool for the job is not always simple. This talk will briefly draw a landscape of some of the most relevant libraries, and of the different puroposes they pursue. We will then take a look at the daily workflow of a "pyconomist"

The tribes of Python and PostgreSQL: How Open Source and its communities shaped my life.

Python saved the day in one of my development projects at the beginning of the century.

Since then, I have been involved with the Python language and
its community. For multiple years I was allowed to contribute to the EuroPython conference,
as well as to help Python conferences in the United Kingdom and Ireland.
 
Being a part of the Python community has shaped a big part of my professional and personal life.
This long-term involvement has also helped me see how the Python community evolves and changes.
By sharing episodes and experiences from the language and its community, I will try to inspire you to
harness the power of open source to create a rewarding life.

Automated Valuation Model: From Data Exploration to Results Validation

Paulius will take you on a journey through our data universe, showing how he uses Python to make sense of an endless stream of data and how Python is used in predictive analysis. Step by step he will reveal the secrets of big data analysis: from the exploration and quality of data to the interpretation and validation of results.

Journey into Monitoring: Paradigms and Stacks

Monitoring is not a trivial task. And there's no "right" or "wrong" way, it always depends...

I'll cover the main paradigms:

  • reactive monitoring;
  • proactive monitoring;

And the most popular stacks:

  • Graphite + Grafana
  • ELK (Elasticsearch + Logstash + Kibana)
  • Prometheus
  • Nagios

I'll also give some short guidelines on alerting. The purpose of this talk is mainly to show the whole landscape, explain the pros and cons of each approach, and give a brief look at how the stacks work at a low level.
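
To make the low-level part concrete, here is a hedged sketch of Graphite's plaintext protocol: one "path value timestamp" line per data point, sent over TCP to Carbon's default port 2003 (the host and metric name are invented):

    import socket
    import time

    def send_metric(path, value, host="graphite.example.com", port=2003):
        """Push one data point to a Graphite/Carbon plaintext listener."""
        line = "{} {} {}\n".format(path, value, int(time.time()))
        with socket.create_connection((host, port), timeout=5) as sock:
            sock.sendall(line.encode("ascii"))

    send_metric("app.requests.latency_ms", 42)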

Future Pythonic Web: ASGI & Daphne

The web evolves. There is an increasing demand for asynchronous programming. More and more protocols do not follow the HTTP-style request/response cycle (most notably, WebSocket). There are brand-new concepts. Happily, Python evolves to answer the call.

In this talk we will examine the part of it that deserves your attention: ASGI (Asynchronous Server Gateway Interface, a spec that is still in progress but now mostly complete) and Daphne (an ASGI server).
We'll go through things you're already familiar with: a template-based website, a REST API, WebSocket/HTTP/2. We'll see how it all feels in action with Django and Flask. We'll also discuss distribution matters (e.g. Docker) and other nuances.
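
For a flavour of the spec, here is a minimal sketch of an ASGI application in the single-callable style - an async callable receiving a connection scope plus receive/send channels (details may shift while the spec is in progress):

    async def app(scope, receive, send):
        assert scope["type"] == "http"  # this sketch handles plain HTTP only
        await send({
            "type": "http.response.start",
            "status": 200,
            "headers": [(b"content-type", b"text/plain")],
        })
        await send({
            "type": "http.response.body",
            "body": b"Hello, ASGI!",
        })

Served with Daphne, this would be started roughly as "daphne mymodule:app".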

Building a Startup with Python, React and AWS

In January I launched a startup built on a Python, React and AWS stack. There was a learning curve to get there, which is why I would like to share some of the quick wins with you. I will explain why using React and Python is a great choice, especially for a small company, and you will see how to quickly and easily get a new React project up and running. You will also learn how to get started with AWS quickly, and you will get a brief overview of the most commonly used AWS features for a standard Python/React project, to dramatically cut down your research time.

Meet Exa, AI Driven Junior Data Analyst

TBA

Writing Tests When The Code is Already There: Golden Master Technique

Inheriting someone else’s code is scary. It might be ugly and unreadable, and the intentions are not always clear - especially if there are no tests. How to deal with it? Characterization tests come to the rescue.
There is no doubt that test coverage brings safety when refactoring or adding new features to code. However, legacy code tends to be untestable, and we are often stuck in a vicious circle where, to test, we must refactor, and to refactor, we have to write tests. The purpose of the characterization test known as Golden Master is to minimize the refactoring and maximize the safety in these situations.
In this session we will learn when and how to apply Golden Master and try to implement it ourselves.
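
As a preview, here is one minimal way a Golden Master test can look (the legacy function and file name are invented; the first run records the output of the untouched code, and every later run must reproduce it exactly):

    import json
    import pathlib

    from legacy import generate_report  # hypothetical code under test

    GOLDEN = pathlib.Path("golden_master.json")

    def test_report_matches_golden_master():
        output = generate_report(year=2017)  # cover as many inputs as practical
        if not GOLDEN.exists():
            # First run: record the current behaviour as the "golden master"
            GOLDEN.write_text(json.dumps(output, sort_keys=True, indent=2))
        assert output == json.loads(GOLDEN.read_text())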

Python in Blockchain

Last year, with Bitcoin's all-time high and thousands of ICOs, programming for blockchain became quite a hot topic. I'd like to give a short overview of Bitcoin and Ethereum and what kind of development is possible there. I'll also share which Python libraries are available and in what kinds of projects Python is most popular within the Bitcoin and Ethereum communities.

You can look at this talk as a short introduction to the possibilities for Python developers in the blockchain world.
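
As one example of those libraries, here is a hedged sketch with web3.py, a popular Python client for Ethereum (the node URL is invented, and method names vary a little between web3.py versions):

    from web3 import Web3

    # Connect to a JSON-RPC node (URL is a placeholder)
    w3 = Web3(Web3.HTTPProvider("https://mainnet.example.com"))

    latest = w3.eth.get_block("latest")
    print("block number:", latest["number"])
    print("transactions:", len(latest["transactions"]))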

 

Creating Solid APIs

Increasingly, our apps are used not by humans but by other apps - via their APIs. Thus it is increasingly important that your APIs are well designed and easy for other developers to consume.
I will share tips and good practices on authentication, versioning, documentation, response structure, and why it all matters.


Adding a few API endpoints to your application for internal consumption is easy. Creating APIs that other developers will love to use is a much harder problem.
You'll need to think about a variety of topics such as versioning, authentication, response structure, documentation and more. There are established good practices for each of them, but developers who haven't done a lot of API work are often not familiar with them.

My talk will show how to build on top of Django and DRF and find reasonable solutions for those problems.
I will talk about JSON API, OAuth2, and other technologies and show how they fit into the puzzle.
The benefits of a standardized response structure, as well as auto-generated documentation, will also be discussed.

I'll introduce OAuth2, discussing when it is a good choice and when not, as well as some trickier parts of it.
Next we'll look at why a standardized response structure such as JSON API makes the lives of 3rd-party developers easier. We'll then move on to versioning and how you can change your API without breaking all existing apps. And the talk wouldn't be complete without looking at documenting your APIs and why the docs should be auto-generated.
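
To illustrate the response-structure point, here is a sketch of a JSON API-style document, written out as a Python dict (the resource type and fields are invented):

    response = {
        "data": {
            "type": "articles",
            "id": "42",
            "attributes": {"title": "Creating Solid APIs", "body": "..."},
            "relationships": {
                # Related resources are referenced by type and id,
                # not embedded wholesale
                "author": {"data": {"type": "people", "id": "7"}},
            },
        },
        "links": {"self": "/api/v1/articles/42"},
    }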

7 Years of Using Python for Data Science DevOps

aka: Data Science, Python, Production – Choose Three

Data science, big data and AI are buzzwords no one has been able to escape in recent years. To put it with a grain of salt: in the past it was sufficient for many companies to just buy a Hadoop cluster in order to show that they were really doing "this data stuff". Times are changing, and companies are slowly realizing that data buried in a datacenter is worth nothing.

The biggest value of data science lies in the data-driven automation of business decisions, making it an active part of a company's value stream.

This implies that, ultimately, data science is about building enterprise-grade applications. While Python has evolved into the de facto standard programming language of the data science universe, it is still not the language of choice for building enterprise-grade, mission-critical software. In this talk I will briefly show you why. Luckily, despite the big tensions between the three "opponents", you can bring them together by applying the principles of the DevOps methodology.

At Blue Yonder, we have more than seven years of (sometimes painful) experience delivering and operating data science applications as a service for our customers using Python only. In this talk I will share important lessons learned: how we deploy, how we test, how we monitor and how we "crunch the numbers". In short, I will take you for a walk through our "pythonic enterprise grade data science delivery pipeline".

Distribution of Python Software

In this talk we will cover how to handle the packaging and distribution of Python projects.

When distributing software we need to be sure that the package/container/procedure works in order to deliver the value of the product.

The talk will cover the basics of virtualenv, the pip installer, and wheels. We will take a look at setup.py files and how to distribute a package to PyPI. We will also discuss solutions for dependencies on C/C++ extensions.
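
As a starting point, a minimal setup.py sketch (name, version and dependencies are placeholders):

    from setuptools import setup, find_packages

    setup(
        name="mypackage",
        version="0.1.0",
        packages=find_packages(),
        install_requires=["requests>=2.0"],  # runtime dependencies
        python_requires=">=3.5",
    )

From there, "python setup.py sdist bdist_wheel" builds the artifacts and "twine upload dist/*" ships them to PyPI.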
 

How People are Learning Python - My Experience from 14 Years of Teaching Python

TBA

Text Analysis With SpaCy, NLTK, Gensim, Scikit-learn, Keras and TensorFlow

The explosion in Artificial Intelligence and Machine Learning is unprecedented - and text analysis is likely the most easily accessible and understandable part of it. And with Python, it is crazy easy to do: Python has been used as a parsing language forever, and with its rich set of Natural Language Processing and Computational Linguistics tools, it's worth doing text analysis even if you don't want to.

The purpose of this talk is to convince the Python community to do text analysis - and to explain both the hows and the whys. Python has traditionally been a very good parsing language, arguably replacing Perl for all text file handling tasks. Reading files, regular expressions, writing to files, and crawling the web for textual data have all been standard ways to use Python - and now, with the Machine Learning and AI explosion, we have a great set of tools in Python to understand all the textual data we can so easily play with.

I will briefly talk about the merits, demerits and use cases of the most popular text processing libraries - in particular spaCy, NLTK and Gensim. I will also talk about how to use traditional Machine Learning libraries for text analysis, such as scikit-learn, Keras and TensorFlow.

Pre-processing is one of the most important steps of text analysis, and I will talk more about this - after all, garbage in, garbage out!
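
For instance, a minimal pre-processing sketch with spaCy (it assumes the small English model has been installed with "python -m spacy download en_core_web_sm"):

    import spacy

    nlp = spacy.load("en_core_web_sm")

    def preprocess(text):
        """Lemmatize, lower-case, and drop stop words and punctuation."""
        doc = nlp(text)
        return [tok.lemma_.lower() for tok in doc
                if not tok.is_stop and not tok.is_punct]

    print(preprocess("The MPs were debating the new budget in Parliament."))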

The final part of the talk will be about where to get your data - and how to create your own textual data as well. You could analyse anything, from your own emails and WhatsApp conversations to freely available British Parliament transcripts!

Online Machine Learning - When it's Infeasible to Train Over The Entire Dataset

The talk will be an introduction to online machine learning, a method of machine learning that needs only a small fraction of the data to update model parameters, as opposed to the commonly used batch learning, which requires the full data set to make a parameter update.

This method can be used when it is infeasible to load the full dataset in memory or when the use of adaptive models is needed.

Multiple online machine learning libraries are available for different technology stacks; we will look at these from a Python user's perspective and focus more on Vowpal Wabbit - a fast out-of-core machine learning system.
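
To show the general idea (using scikit-learn's partial_fit rather than Vowpal Wabbit itself), here is a hedged sketch in which the model only ever sees one mini-batch at a time:

    import numpy as np
    from sklearn.linear_model import SGDClassifier

    model = SGDClassifier()
    classes = np.array([0, 1])  # all labels must be declared up front

    for _ in range(100):  # stand-in for an endless stream of mini-batches
        X = np.random.randn(32, 10)
        y = (X[:, 0] > 0).astype(int)  # toy labels for illustration
        model.partial_fit(X, y, classes=classes)

    X_test = np.random.randn(1000, 10)
    y_test = (X_test[:, 0] > 0).astype(int)
    print("accuracy:", model.score(X_test, y_test))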
 

Talking to Amazon Alexa using Python

I will be talking about:

- What is an Alexa Skill
- Why knowing how to develop an Alexa skill is important
- How to make an Alexa Skill using Python

More than 20 million Alexa devices are being sold every quarter, and Alexa is becoming a part of our lives day by day. But we still have fewer than 20,000 skills developed for the Alexa platform.
I see that as an opportunity, an opportunity to harness the power of an amazing platform.

And this talk is designed exactly for that: to give you a glimpse of what it is all about and how you yourself can teach Alexa to talk to you.

By the end of this talk, you will know what all the craze about Alexa is: what exactly it is, how to teach Alexa ‘skills’, and how to customise it for your use cases. And for the crux of the talk, I will give a live hands-on demo of how to do all this using Python.
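
As a preview of the demo, here is a minimal sketch of an AWS Lambda handler for a custom Alexa skill (the intent name is invented; the returned dict is the response envelope Alexa expects):

    def lambda_handler(event, context):
        request = event["request"]
        if request["type"] == "LaunchRequest":
            text = "Hello! Ask me to say hi."
        elif (request["type"] == "IntentRequest"
              and request["intent"]["name"] == "HelloIntent"):
            text = "Hi from Python!"
        else:
            text = "Sorry, I did not get that."
        # Alexa Skills Kit response envelope
        return {
            "version": "1.0",
            "response": {
                "outputSpeech": {"type": "PlainText", "text": text},
                "shouldEndSession": True,
            },
        }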

Jupyter Notebook: Between Interactive Document and Specialized Web Application

Jupyter Notebooks (previously IPython Notebooks) have already become the product of choice for many scientific and educational projects, ranging from particle physics and cosmology to data science and all kinds of tutorials. For the majority of notebook users, a notebook works like a (semi-)interactive document: one may combine descriptive text (with Markdown formatting), executable code and computation results, including plots and images, and save the notebook for future non-interactive work. On the other hand, Jupyter notebooks support various kinds of interactive widgets, ranging from geographic maps to complex dashboards, resulting in a specialized web interface. In my talk, I will explain how these two kinds of usage are reflected in the notebook architecture and discuss how to select the right approach for your product.
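
A minimal sketch of the second usage style - an interactive widget inside a notebook cell (requires the ipywidgets package):

    import ipywidgets as widgets
    from IPython.display import display

    slider = widgets.IntSlider(min=0, max=100, value=50, description="n:")

    def on_change(change):
        # Called each time the slider value changes
        print("new value:", change["new"])

    slider.observe(on_change, names="value")
    display(slider)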

 

Functions: What a Concept

We will start with the basics. What is a function? What is a function in a programming language? What is a function in Python? And then we will start digging deeper. What is a bound function? What is an unbound function? What is a closure? How is a function represented in CPython's C code? And so on, until time runs out.
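
A small sketch of two of those concepts - a closure, and bound vs. plain functions (illustrative only; the talk goes much deeper):

    def make_counter():
        count = 0
        def bump():            # a closure: it captures `count`
            nonlocal count
            count += 1
            return count
        return bump

    counter = make_counter()
    print(counter(), counter())   # 1 2
    print(counter.__closure__)    # the captured cell lives here

    class Greeter:
        def hello(self):
            return "hi"

    g = Greeter()
    print(Greeter.hello)  # a plain function (Python 3 dropped "unbound methods")
    print(g.hello)        # a bound method: `self` is already filled in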