Contact: Pol I. Sans

Data Science & Analytics

Collection, exploration, processing, representation...

Work in Progress
Intro_

After years as a web and graphic designer, I decided to come out as a computer programmer, specifically as a Python developer for data science.
I am passionate about writing code to fetch, process, represent and analyse data, whether to reach unexpected conclusions or to validate earlier hypotheses.
Throughout my career I have worked both in teams and as a freelancer, and I have managed projects on different subjects and of various sizes and durations.
I am very creative, versatile, accurate and always keen to improve myself.

I am from Barcelona and based there, and I have also worked in London and in Mexico.

IT Skills_

An overview of my current (and always improving) IT skills.

Python
. Pandas
. NumPy
. scikit-learn
. Matplotlib
. Seaborn
. Selenium
. Beautiful Soup
. Folium
...
IDEs:
. Jupyter Notebook
. PyCharm
. Kibana
...
SQL
. MySQL
. SQLite
...
IDEs:
. MySQL Workbench
. DB Browser
...
MongoDB
JSON
XML
HTML5
CSS3
JavaScript
Read only
Projects and Contributions_

Some of the Python and/or data-related projects I have worked on, as part of a team or solo.

Epidemium 3

Epidemium explores new paths to cancer research with data, a community, and an open science-oriented approach by leveraging technology, interconnectivity, and Big Data.

Jump2Digital

A two-stage hackathon held in Barcelona. The first stage was a qualifying data science exercise, and in the second I took part as a member of a full-stack team. I was on the winning team.

CryptoPunks

A full data science team project to help an investor select the best CryptoPunks by collecting, exploring, processing and analysing related data.



A bit (un poco) of code_

Some code examples extracted from various projects.

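# Imports used by the snippets in this section
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load Barcelona weather data and keep the columns of interest
df = pd.read_csv('barcelona_rain.csv', index_col=[0])
dfmain = df[['fecha', 'tmed', 'sol', 'presMax', 'velmedia', 'dir', 'humidity', 'prec']]
# Boxplot of 'prec', with whiskers at the 5th and 95th percentiles
fig, ax = plt.subplots(figsize=(19, 2))
sns.boxplot(x=dfmain["prec"], whis=[5, 95], ax=ax)
# Histograms (with KDE curves) of 'tmed', 'sol' and 'presMax', side by side
fig, ax = plt.subplots(1, 3, figsize=(19, 4))
sns.histplot(x='tmed', data=dfmain, color='g', kde=True, edgecolor="w", alpha=.3, ax=ax[0], bins=300)
sns.histplot(x='sol', data=dfmain, color='y', kde=True, edgecolor="w", alpha=.3, ax=ax[1], bins=300)
ax[1].set(ylim=(0, 300))  # cap the y axis so one tall bar does not flatten the rest
sns.histplot(x='presMax', data=dfmain, color='r', kde=True, edgecolor="w", alpha=.3, ax=ax[2], bins=300)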
# Load datasets
df1 = pd.read_csv('newcryptopunks.csv')
df2 = pd.read_csv('cryptopunks_traits1.csv')
...
# Keep only the rows with a recorded sale price and renumber them from 0
df1 = df[~df['venta_usd'].isna()].reset_index(drop=True)
# Per-trait statistics. [1] picks the group whose trait value is 1
# (presumably the punks that have the trait).
total, total_vendidos, porcentaje_vendido, precio_medio, precio_max = [], [], [], [], []
for trt in traitslist:
    ct1 = df.groupby([trt])['cryptopunk'].count()[1]   # punks in the group
    total.append(ct1)
    ct2 = df.groupby(trt)['venta_usd'].count()[1]      # of those, punks with a sale price
    total_vendidos.append(ct2)
    porcentaje_vendido.append(ct2 / ct1 * 100)
    precio_medio.append(df.groupby([trt])['venta_usd'].mean()[1])
    precio_max.append(df.groupby([trt])['venta_usd'].max()[1])
cols = ['traits', 'total', 'total_vendidos', 'porcentaje_vendido', 'precio_medio', 'precio_max']
traits_stats = pd.DataFrame(list(zip(traitslist, total, total_vendidos, porcentaje_vendido, precio_medio, precio_max)), columns=cols)
traits_stats = traits_stats.round(2)
...
df.groupby(['total_traits'])['venta_usd'].mean().round(2)
df.groupby(['tipus'])['cryptopunk'].count().round(2)
df.groupby(['skin'])['venta_usd'].mean().round(0)
(df.groupby(['owner'])['venta_usd'].mean().round(2)).mean()
(df.groupby(['owner'])['venta_usd'].count().round(2)).mean()
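
The per-trait loop above depends on data files that are not shown here, so the following is only a minimal sketch of the same idea on a tiny made-up table. The values, the `hat` and `beard` columns and the helper name `trait_stats` are all hypothetical; it also assumes that each trait column is a 0/1 flag, which the snippets above suggest but do not confirm.

```python
import pandas as pd

# Hypothetical toy data: column names follow the snippets above, values are invented.
df = pd.DataFrame({
    "cryptopunk": [1, 2, 3, 4, 5],
    "venta_usd": [100.0, None, 300.0, 50.0, None],  # None = no recorded sale
    "hat": [1, 1, 0, 1, 0],      # assumed 0/1 trait flag
    "beard": [0, 1, 1, 0, 0],    # assumed 0/1 trait flag
})
traitslist = ["hat", "beard"]


def trait_stats(df, traits):
    """Summarise sales for the punks that carry each trait (flag == 1)."""
    rows = []
    for trt in traits:
        with_trait = df[df[trt] == 1]
        total = with_trait["cryptopunk"].count()
        sold = with_trait["venta_usd"].count()  # count() skips missing prices
        rows.append({
            "traits": trt,
            "total": total,
            "total_vendidos": sold,
            "porcentaje_vendido": sold / total * 100,
            "precio_medio": with_trait["venta_usd"].mean(),
            "precio_max": with_trait["venta_usd"].max(),
        })
    return pd.DataFrame(rows).round(2)


traits_stats = trait_stats(df, traitslist)
print(traits_stats)
```

Filtering once per trait and building one dictionary per row avoids keeping five parallel lists in step, at the cost of assuming every trait appears at least once in the data.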