Posts by ‘Flávia Missi’

[Flavia Missi] Building and installing lxml with PyPy

Wednesday, September 4th, 2013

Introduction

The major issue my colleagues and I found when we started running some projects with PyPy was the lxml library. It is built with Cython, which only works with PyPy if the code is written portably enough. So an effort began to port lxml to CFFI. This effort can be found on this fork, and that is the code we’re going to install from.

Resolving dependencies

We are going to install lxml on Ubuntu 13.04. Be warned that installing on OS X (<10.8) might give you serious headaches. Start by running the following apt-get command:

$ sudo apt-get install libxml2-dev libxslt1-dev zlib1g-dev

These packages are needed to build lxml (with Python or PyPy).

Bootstrapping your environment

You’ll need a PyPy binary to build lxml with. You can follow Andrews Medina’s guide to install it (but it’s in Portuguese…).

Assuming you have it installed, let’s create a virtual environment (with virtualenv and virtualenvwrapper) to install lxml in:

$ mkvirtualenv lxml-pypy -p /path/to/pypy/bin/pypy

Now clone the lxml fork and check out the CFFI branch:

$ git clone https://github.com/amauryfa/lxml.git
$ cd lxml
$ git checkout origin/cffi

Now build and install (double-check that you’re in the right virtual environment):

$ python setup.py build
$ python setup.py install
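To confirm that the build really landed in the PyPy virtual environment, a quick check like the one below can help. It is only a sketch: check_lxml is a name I made up, and it assumes the fork exposes the same lxml.etree API as upstream lxml.

```python
import platform


def check_lxml():
    """Return (interpreter name, tag of the first child), or None if lxml
    cannot be imported. check_lxml is a hypothetical helper, not part of lxml.
    """
    try:
        from lxml import etree
    except ImportError:
        return None
    # Parse a tiny document to prove the bindings actually work.
    root = etree.fromstring("<root><child>hi</child></root>")
    return platform.python_implementation(), root[0].tag


if __name__ == "__main__":
    print(check_lxml())
```

Run inside the virtual environment, the first element of the result should be the interpreter name reported by the platform module; None means lxml is not importable from that interpreter.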

Done!

[Flavia Missi] The proc filesystem

Sunday, July 14th, 2013

Intro

If you use tools such as ps and top, then you are already using the proc filesystem, even if you have never run ls on it or opened one of its files. That’s because these tools read this filesystem to collect information about processes, which is exactly what it is for: exposing information about processes.

But what exactly is the proc filesystem?

The proc filesystem is actually a pseudo filesystem used as an interface to access kernel data structures. It’s mostly informative and read-only, but some files are writable and let you change kernel settings.

What kind of information does /proc store?

Let’s take a look at the filesystem structure to understand what exactly it stores. The following is the result of an ls inside /proc:

$ ls -l
dr-xr-xr-x  9 root    root    1
dr-xr-xr-x  9 root    root    10
# ... omitted output
dr-xr-xr-x  9 root    root    9910
dr-xr-xr-x  2 root    root    acpi
dr-xr-xr-x  4 root    root    asound
-r--r--r--  1 root    root    buddyinfo
dr-xr-xr-x  4 root    root    bus
-r--r--r--  1 root    root    cgroups
-r--r--r--  1 root    root    cmdline
-r--r--r--  1 root    root    consoles
-r--r--r--  1 root    root    cpuinfo
-r--r--r--  1 root    root    crypto
-r--r--r--  1 root    root    devices
-r--r--r--  1 root    root    diskstats
-r--r--r--  1 root    root    dma
dr-xr-xr-x  2 root    root    driver
-r--r--r--  1 root    root    execdomains
-r--r--r--  1 root    root    fb
-r--r--r--  1 root    root    filesystems
dr-xr-xr-x  8 root    root    fs
-r--r--r--  1 root    root    interrupts
-r--r--r--  1 root    root    iomem

Let’s make sense of it: the numbered entries are directories named after process IDs. Each of these directories contains information about the process it refers to, such as the command the process is executing, its command line, its environment variables, memory mapping information (such as the libraries in use), and much more.

It’s worth noting that the contents of some of these files are null-separated. You can pipe cat through tr to replace the null bytes, e.g.
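To make the "numbers are PIDs" idea concrete, here is a minimal Python sketch that picks the process directories out of a /proc listing. The function names pids_from_names and list_pids are my own, chosen just for illustration.

```python
import os


def pids_from_names(names):
    """Keep only the entries that look like process IDs, sorted numerically."""
    return sorted(int(name) for name in names if name.isdigit())


def list_pids(proc_root="/proc"):
    """List the PIDs that currently have a directory under proc_root."""
    return pids_from_names(os.listdir(proc_root))
```

Calling list_pids() on a Linux box should give roughly the same set of PIDs that ps shows, since the set changes as processes start and exit.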

$ cat 1/environ | tr "\000" "\n"
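The same null-splitting can be done from Python. The sketch below assumes the usual NAME=value layout of an environ file; parse_environ and read_environ are hypothetical helper names.

```python
def parse_environ(raw):
    """Turn the raw bytes of an environ file into a dict of bytes -> bytes."""
    env = {}
    for entry in raw.split(b"\0"):
        if not entry:
            continue  # skip the empty chunk left by the trailing null byte
        name, _, value = entry.partition(b"=")
        env[name] = value
    return env


def read_environ(pid):
    """Read and parse /proc/<pid>/environ (needs permission to read it)."""
    with open("/proc/%d/environ" % pid, "rb") as handle:
        return parse_environ(handle.read())
```

The file is opened in binary mode because environment values are not guaranteed to be valid text in any particular encoding.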

Now let’s run ls on the /proc/1 directory; this PID always refers to the init process:

$ ls -ltr
-r--r--r--  1 root    root   cmdline
-r--r--r--  1 root    root   status
-r--r--r--  1 root    root   stat
lrwxrwxrwx  1 root    root   exe -> /sbin/init
-r--r--r--  1 root    root   limits
lrwxrwxrwx  1 root    root   root -> /
-r--r--r--  1 root    root   wchan
dr-xr-xr-x  3 root    root   task
-r--r--r--  1 root    root   syscall
-r--r--r--  1 root    root   statm
-r--r--r--  1 root    root   stack
-r--r--r--  1 root    root   smaps
-r--r--r--  1 root    root   sessionid
-r--r--r--  1 root    root   schedstat
-rw-r--r--  1 root    root   sched
-r--r--r--  1 root    root   personality
-r--r--r--  1 root    root   pagemap
-rw-r--r--  1 root    root   oom_score_adj
-r--r--r--  1 root    root   oom_score
-rw-r--r--  1 root    root   oom_adj
-r--r--r--  1 root    root   numa_maps
dr-x--x--x  2 root    root   ns
dr-xr-xr-x  5 root    root   net
-r--------  1 root    root   mountstats
-r--r--r--  1 root    root   mounts
-r--r--r--  1 root    root   mountinfo
-rw-------  1 root    root   mem
-r--r--r--  1 root    root   maps
dr-x------  2 root    root   map_files
-rw-r--r--  1 root    root   loginuid
-r--r--r--  1 root    root   latency
-r--------  1 root    root   io
dr-x------  2 root    root   fdinfo
dr-x------  2 root    root   fd
-r--------  1 root    root   environ
lrwxrwxrwx  1 root    root   cwd -> /
-r--r--r--  1 root    root   cpuset
-rw-r--r--  1 root    root   coredump_filter
-rw-r--r--  1 root    root   comm
--w-------  1 root    root   clear_refs
-r--r--r--  1 root    root   cgroup
-r--------  1 root    root   auxv
-rw-r--r--  1 root    root   autogroup
dr-xr-xr-x  2 root    root   attr

I’ll cover only the most important files. Some of their contents show up in the output of process management commands such as ps, as I said before; others you’ll only find if you come into this directory.

/proc/[pid]/task/

This directory contains all the threads of the process, one subdirectory per thread, each named after the thread ID (tid). Each subdirectory has basically the same structure as /proc/[pid]: for attributes shared by all threads the file contents are the same, while for per-thread attributes the corresponding files may hold different values (e.g. /proc/[pid]/task/[tid]/status).

/proc/[pid]/status

Provides the same information as /proc/[pid]/stat and /proc/[pid]/statm, formatted for humans.

/proc/[pid]/stat holds status information about the process and is what the ps command reads; /proc/[pid]/statm holds information about memory usage.

For information about the columns and fields see the proc manual page.
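Since the status file is a list of "Key:<tab>value" lines, it is easy to load into a dictionary. This is a minimal sketch; parse_status is a hypothetical name and all values are kept as plain strings.

```python
def parse_status(text):
    """Parse the 'Key:<tab>value' lines of a status file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore anything that is not a key/value line
            fields[key.strip()] = value.strip()
    return fields
```

For example, parse_status(open("/proc/1/status").read())["State"] would give the state of init as a string, leaving any further conversion (sizes in kB, numeric IDs) to the caller.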

/proc/[pid]/root

This file is a symbolic link that points to the process’s root directory. Each process can have its own root, set by the chroot(2) system call, and that per-process root is one of the building blocks of container virtualization. See the chroot(2) manual for more information.

/proc/[pid]/ns/

Subdirectory containing one entry for each namespace that supports being manipulated by setns. If you’re curious and enjoy some black magic, take a look at the manuals of clone and setns.

/proc/[pid]/coredump_filter

Through this file you can control which memory segments are written to the core dump file when one is generated for the corresponding process. For more information see the core(5) manual page.
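The file holds a hexadecimal bitmask, so a small decoder makes it readable. The bit meanings below follow the list in core(5); COREDUMP_BITS and decode_coredump_filter are names I picked for this sketch.

```python
# Bit positions as documented in core(5), lowest bit first.
COREDUMP_BITS = [
    "anonymous private mappings",    # bit 0
    "anonymous shared mappings",     # bit 1
    "file-backed private mappings",  # bit 2
    "file-backed shared mappings",   # bit 3
    "ELF headers",                   # bit 4
    "private huge pages",            # bit 5
    "shared huge pages",             # bit 6
    "private DAX pages",             # bit 7
    "shared DAX pages",              # bit 8
]


def decode_coredump_filter(text):
    """Translate the hex mask read from coredump_filter into a list of names."""
    mask = int(text.strip(), 16)
    return [name for bit, name in enumerate(COREDUMP_BITS) if mask & (1 << bit)]
```

Fed the documented default mask of 0x33, this lists the anonymous private and shared mappings, the ELF headers, and private huge pages.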

/proc/[pid]/cmdline

This file holds the complete command line for the process, unless it’s a zombie; in the case of those walkers, this file will be empty.
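The arguments in this file are separated by null bytes, so a reader has to split on them and cope with the empty case. A minimal sketch, with parse_cmdline as a hypothetical helper name:

```python
def parse_cmdline(raw):
    """Split the raw bytes of a cmdline file into a list of arguments."""
    if not raw:
        return []  # zombies (and kernel threads) expose an empty file
    # Each argument is null-terminated, so drop the trailing null first.
    return raw.rstrip(b"\0").split(b"\0")
```

An empty list is therefore a cheap hint that the PID belongs to a zombie or to a kernel thread, not to a regular running program.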

/proc/[pid]/cwd

Symbolic link to the current working directory of the process. For instance, if you want to find the current working directory of process 20, run:

$ cd /proc/20/cwd; /bin/pwd
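From Python the same lookup is a single readlink call. This is a sketch under the assumption that you have permission to inspect the target process; cwd_link and process_cwd are made-up names.

```python
import os


def cwd_link(pid, proc_root="/proc"):
    """Build the path of the cwd symlink for a given PID."""
    return "%s/%d/cwd" % (proc_root, pid)


def process_cwd(pid):
    """Resolve the symlink to get the working directory of the process."""
    return os.readlink(cwd_link(pid))
```

Unlike the cd approach, this does not change the working directory of the calling process.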

/proc/[pid]/environ

This file contains the environment variables for the process, null-separated.

/proc/[pid]/exe

This file is a symbolic link containing the pathname of the executed command.

These are some of the files under /proc/[pid]/ that I find important or just curious. I might have forgotten some; if you think I did, don’t hesitate to tell me so!

Also, as you might have noticed, I simply didn’t address the files right under /proc. That’s because I see the per-process information as more important, given my programming background and day-to-day issues. That’s why I am leaving the job of covering those to the manuals (which, BTW, cover the topic very well). Use $ man proc to get a complete explanation of what information each file can give you, and search that page for /proc/<filename> for more information about a specific file.

[Flavia Missi] Testing a webserver written in Go

Monday, May 21st, 2012

I’ve recently been working on an API that needed to be super fast and make async calls to Canonical’s Juju. For this job, my team and I chose Golang, which aims to be fast and easy to learn. …

[Flavia Missi] Testing signals in Django

Monday, November 7th, 2011

The other day I had to do something silly, but new to me: test that a signal was sent as soon as an action was taken. After researching, debugging, and reading a bit of Django’s source code, I decided to take the following approach. First I created a method to serve as a listener: from unittest import TestCase from my_app.signals import [...]

[Flavia Missi] Sorting Algorithms: Insertion Sort

Friday, October 14th, 2011

Hi! Today I’m here to talk about algorithms. I’ve been studying some topics beyond the basics and wanted to share what I’ve been learning with people who are just starting out, like me. I’ll begin with a series of articles about sorting algorithms. These algorithms have a certain complexity, so it’s advisable to have some background knowledge on the subject. [...]

[Flavia Missi] My work environment in 7 items

Friday, October 14th, 2011

Well… Francisco summoned me to blog about my items and he hasn’t stopped pestering me since, so here I am. Let’s get to the point. 1. Linux Yeah! I can’t live without my Ubuntu. I might switch distros one of these days, but I’ll never leave Linux. I feel very comfortable and “free” in [...]

[Flavia Missi] Installing rvm, Ruby and Rails on Ubuntu 10.10

Friday, October 14th, 2011

Well, this is going to be a rather short post, but I had some problems running rvm + gems + rails, so I decided to share the bugs and the solutions I found. I believe it will even serve as a reference for me in the future. So let’s go. Installing rvm: installing rvm is as simple as it seems [...]

[Flavia Missi] Finding out whether a coordinate is inside a polygon

Friday, October 14th, 2011

The initial problem: this problem came up when I was solving the Volei Marciano problem on SPOJ. To explain quickly what it is about, the problem asks us to position referees, and each referee can watch one or more lines. The image should help (each dot is a referee, and the arrows indicate the line(s) they watch): [...]

[Flavia Missi] My first Giran Siege: talking about the Cells gem

Friday, October 14th, 2011

Today, Friday (06/05), I presented my first Siege here at Giran. For those who don’t know, a Siege is an internal talk held every Wednesday here at Giran. This was an unusual week, and since my Siege was running late due to structural problems in my building (I got stuck in the elevator #epicfail), I couldn’t [...]

[Flavia Missi] Cryptography: Caesar Cipher in Python

Friday, October 14th, 2011

Good afternoon! I’m writing this post to give an introductory explanation of the concepts behind cryptography, talk a bit about the Caesar cipher, and finally write an algorithm in Python that puts the explained concepts into practice. Let’s go. Cryptography: I’ll start from the assumption that everyone knows the basic concept of cryptography [...]