Pol I. Sans . Data Science

After years as a web and graphic designer, I decided to become a computer programmer, specifically a Python developer for data science.
I am passionate about writing code to fetch, process, represent and analyse data, to reach unexpected conclusions or validate previous hypotheses.
Throughout my career I have worked both in teams and as a freelancer, and I have managed projects of different subjects, sizes and durations.
I am very creative, versatile, accurate and always keen to improve myself.

Originally from Barcelona and based there, I have also worked in London and Mexico.

An overview of my current (and always improving) IT skills.

Python

. Pandas
. NumPy
. scikit-learn
. Matplotlib
. Seaborn
. Selenium
. Beautiful Soup
. Folium
...
IDEs:
. Jupyter Notebook
. PyCharm
. Kibana
...

SQL

. MySQL
. SQLite
...
IDEs:
. MySQL Workbench
. DB Browser
...

MongoDB

JSON

XML

HTML5

CSS3

JavaScript

Read only

Some of the Python and/or data-related projects I have worked on, as part of a team or solo.

Epidemium 3

Epidemium explores new paths to cancer research with data, a community, and an open science-oriented approach by leveraging technology, interconnectivity, and Big Data.

Jump2Digital

A two-stage hackathon held in Barcelona. The first stage was a qualifying data science exercise; in the second, I competed as part of a full-stack team. I was a member of the winning team.

CryptoPunks

A data science team project to help an investor select the best CryptoPunks by collecting, exploring, processing and analysing related data.

Some code examples extracted from various projects.

Data Exploration & Representation

Code snippets for reviewing the provided data, usually through plots, to decide the best next step for processing it and putting it to use.

Example 1. Barcelona Weather

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load the Barcelona weather data and keep the columns of interest
df = pd.read_csv('barcelona_rain.csv', index_col=[0])
dfmain = df[['fecha', 'tmed', 'sol', 'presMax', 'velmedia', 'dir', 'humidity', 'prec']]

# Boxplot of 'prec' with whiskers at the 5th and 95th percentiles
fig, ax = plt.subplots(figsize=(19, 2))
sns.boxplot(x=dfmain['prec'], whis=(5, 95), ax=ax)

# Histograms with a KDE overlay for 'tmed', 'sol' and 'presMax'
fig, ax = plt.subplots(1, 3, figsize=(19, 4))
sns.histplot(x='tmed', data=dfmain, color='g', kde=True, edgecolor='w', alpha=.3, ax=ax[0], bins=300)
sns.histplot(x='sol', data=dfmain, color='y', kde=True, edgecolor='w', alpha=.3, ax=ax[1], bins=300)
ax[1].set(ylim=(0, 300))  # cap the y-axis so the tallest bin does not hide the rest
sns.histplot(x='presMax', data=dfmain, color='r', kde=True, edgecolor='w', alpha=.3, ax=ax[2], bins=300)
plt.show()

Example 2. Cryptopunks

import pandas as pd

# Load databases
df1 = pd.read_csv('newcryptopunks.csv')
df2 = pd.read_csv('cryptopunks_traits1.csv')
...
# df is assumed to be the combined DataFrame built in the elided step above.
# Keep only the punks that have a recorded sale price in USD
df1 = df[~df['venta_usd'].isna()].reset_index(drop=True)

# Per-trait statistics; each trait column is assumed to be a 0/1 flag,
# so .loc[1] selects the group of punks that have the trait
total, total_vendidos, porcentaje_vendido, precio_medio, precio_max = [], [], [], [], []
for trt in traitslist:
    ct1 = df.groupby(trt)['cryptopunk'].count().loc[1]   # punks with the trait
    total.append(ct1)
    ct2 = df.groupby(trt)['venta_usd'].count().loc[1]    # of those, punks sold
    total_vendidos.append(ct2)
    porcentaje_vendido.append(ct2 / ct1 * 100)
    precio_medio.append(df.groupby(trt)['venta_usd'].mean().loc[1])
    precio_max.append(df.groupby(trt)['venta_usd'].max().loc[1])

cols = ['traits', 'total', 'total_vendidos', 'porcentaje_vendido', 'precio_medio', 'precio_max']
traits_stats = pd.DataFrame(
    list(zip(traitslist, total, total_vendidos, porcentaje_vendido, precio_medio, precio_max)),
    columns=cols,
)
traits_stats = traits_stats.round(2)
...
df.groupby(['total_traits'])['venta_usd'].mean().round(2)
df.groupby(['tipus'])['cryptopunk'].count().round(2)
df.groupby(['skin'])['venta_usd'].mean().round(0)
(df.groupby(['owner'])['venta_usd'].mean().round(2)).mean()
(df.groupby(['owner'])['venta_usd'].count().round(2)).mean()

Links

GitHub exploration samples

IP Map
France Map 1
France Map 2
France Map 3

Data Processing
Examples of data processing code mainly executed prior to a Machine Learning training process.

Under Construction
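
Until the project examples are published, here is a minimal sketch of the kind of step this section refers to: imputing missing values, standardizing numeric columns and one-hot encoding categorical ones before training. It is not taken from any of the projects above. The column names are borrowed from the Barcelona weather example, the values are invented, and the helper name preprocess is hypothetical.

```python
import pandas as pd

# Toy data, invented for illustration only
raw = pd.DataFrame({
    "tmed": [12.0, None, 18.0, 20.0],  # numeric feature with a missing value
    "dir": ["N", "S", "N", "E"],       # categorical feature
    "prec": [0.0, 5.2, 0.0, 1.1],      # target column
})


def preprocess(df, numeric_cols, categorical_cols):
    """Return a copy of df with numeric columns imputed and standardized,
    and categorical columns one-hot encoded (hypothetical helper)."""
    out = df.copy()
    for col in numeric_cols:
        # Fill gaps with the column median, then scale to zero mean and unit variance
        out[col] = out[col].fillna(out[col].median())
        out[col] = (out[col] - out[col].mean()) / out[col].std(ddof=0)
    # One 0/1 indicator column per category
    return pd.get_dummies(out, columns=categorical_cols, dtype=int)


features = preprocess(raw.drop(columns="prec"), ["tmed"], ["dir"])
target = raw["prec"]
print(features)
```

In a real pipeline the median, mean and standard deviation would be computed on the training split only and then reused on the test split, to avoid leaking information. A constant column would also need special handling, since its standard deviation is zero.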

Data Collection
Data collected through web scraping, PDF scraping and natural-language text sources to build databases.

Example 1
Example 2
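
As a rough illustration of the scraping step, the sketch below turns an HTML table into a list of records. It is an assumption-laden stand-in, not project code: it uses the standard-library html.parser instead of Beautiful Soup or Selenium so that it runs without extra dependencies, the page is an inline string rather than a downloaded document, and the class name TableScraper and the sample values are invented.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Inline HTML standing in for a downloaded page (invented sample data)
PAGE = """
<table>
  <tr><th>city</th><th>rain_mm</th></tr>
  <tr><td>Barcelona</td><td>5.2</td></tr>
  <tr><td>London</td><td>12.0</td></tr>
</table>
"""

scraper = TableScraper()
scraper.feed(PAGE)
scraper.close()

# First row is the header; the remaining rows become dictionaries
header, *body = scraper.rows
records = [dict(zip(header, row)) for row in body]
print(records)
```

The scraped values are still strings at this point; converting them to numbers and loading them into a DataFrame or a database would be the next step.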

Machine Learning
Machine learning code examples, both supervised and unsupervised.

Example 1
Example 2
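
To make the supervised/unsupervised distinction concrete, here is a small scikit-learn sketch. It is not one of the project examples: the Iris dataset bundled with scikit-learn, the choice of logistic regression and k-means, and all parameter values are assumptions picked for brevity.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Supervised: learn from labelled samples, then score on held-out data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")

# Unsupervised: group the same samples without ever looking at the labels
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
clusters = kmeans.fit_predict(X)
print("cluster sizes:", [int((clusters == k).sum()) for k in range(3)])
```

The classifier is judged against known labels, while the clustering has no ground truth to compare with; its cluster numbers are arbitrary and only the grouping itself is meaningful.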

Structured and Unstructured Databases
Code examples for working with structured (SQL) and unstructured (document-based) databases.

Example 1
Example 2
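
The contrast between the two kinds of storage can be sketched with the standard library alone. This is an illustration under stated assumptions, not project code: an in-memory SQLite table plays the structured side, and a list of JSON documents stands in for a document store such as MongoDB, which would need a running server. The table name punks and all values are invented; the column names tipus and venta_usd are borrowed from the CryptoPunks example.

```python
import json
import sqlite3

# Structured: every row follows the same fixed schema
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE punks (id INTEGER PRIMARY KEY, tipus TEXT, venta_usd REAL)")
conn.executemany(
    "INSERT INTO punks (id, tipus, venta_usd) VALUES (?, ?, ?)",
    [(1, "Alien", 100.0), (2, "Male", 40.0), (3, "Male", 60.0)],
)
# Aggregate with SQL: average sale price per type
avg_by_type = dict(
    conn.execute(
        "SELECT tipus, AVG(venta_usd) FROM punks GROUP BY tipus ORDER BY tipus"
    )
)
conn.close()
print(avg_by_type)

# Unstructured: each document may carry different fields
raw_docs = [
    '{"id": 1, "traits": ["Cap", "Pipe"]}',
    '{"id": 2, "traits": ["Hoodie"], "owner": "0xabc"}',
    '{"id": 3}',
]
documents = [json.loads(doc) for doc in raw_docs]
# Query by content, tolerating documents that lack the "traits" field
with_pipe = [doc["id"] for doc in documents if "Pipe" in doc.get("traits", [])]
print(with_pipe)
```

The SQL side rejects rows that do not fit the schema but makes aggregation a one-line query; the document side accepts any shape, so each query has to cope with missing fields.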

Data Science & Analytics

Collection, exploration, processing, representation...
