Searching words with ‘grep’ in multiple files

Linux / Command line

Ezequiel Garcia

A common situation while debugging, refactoring or just programming is having to search for a word or sentence across a huge number of files and folders.
Several approaches could be implemented, but they all come down to reading the files, quickly or slowly, one by one until a match is found.

Fortunately, GNU provides a powerful tool called ‘grep’. Basically, it filters the lines of a file looking for a specific word. It uses an algorithm optimized for reading files; some say the real secret is that it doesn’t read at all.

This example will show you the matches in the file <filename>.

$ grep "foo" <filename>

Now let’s go a step further by adding some parameters to the ‘grep’ command so that it searches all the files and folders under our current location: -r makes the search recursive, -H prints the file name for each match and -n prints the line number.

$ grep -nHr "foo"

Finally, this example will show you a list of matches, each with the file name followed by the line number and the corresponding matching line.

$ grep -nHr "frequency"
test/mpeg-freq-test.c:49: struct v4l2_frequency vf;
test/mpeg-freq-test.c:55: vf.frequency = f[cnt % 2] * 16;
test/mpeg-freq-test.c:59: perror("could not set frequency");
doc/README.radio:26: -f Tune to a specific frequency
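
If the tree contains many irrelevant files, the recursive search can also be restricted to certain file names with --include (a quick sketch; the pattern and the glob here are just examples):

$ grep -nHr --include='*.c' "frequency" .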

htop: uptime (!)

Linux / Command line

Nicolás Sugino

While doing a routine check on a client’s machines, I noticed something that had never caught my attention until now…

When we open our beloved and user-friendly htop:

[Image: a typical htop with more than 40 execution threads; the screenshot had to be cropped a bit]

In the image we can see that, next to the not-so-small figure of 322 days of uptime, there is a (!). It seems it was always there, but I had never paid attention to it. Digging a little into what it means, I ended up in the htop source code, where we can see:

static void UptimeMeter_updateValues(Meter* this, char* buffer, int len) {
   int totalseconds = Platform_getUptime();
   if (totalseconds == -1) {
      snprintf(buffer, len, "(unknown)");
      return;
   }
   int seconds = totalseconds % 60;
   int minutes = (totalseconds/60) % 60;
   int hours = (totalseconds/3600) % 24;
   int days = (totalseconds/86400);
   this->values[0] = days;
   if (days > this->total) {
      this->total = days;
   }
   char daysbuf[15];
   if (days > 100) {
      snprintf(daysbuf, sizeof(daysbuf), "%d days(!), ", days);
   } else if (days > 1) {
      snprintf(daysbuf, sizeof(daysbuf), "%d days, ", days);
   } else if (days == 1) {
      snprintf(daysbuf, sizeof(daysbuf), "1 day, ");
   } else {
      daysbuf[0] = '\0';
   }
   snprintf(buffer, len, "%s%02d:%02d:%02d", daysbuf, hours, minutes, seconds);
}

And there we realize it was just htop congratulating us(?) for having passed the 100-day uptime milestone 😛

Saving compile time with make -j

Linux / Compiling

Sebastián Misuraca

When compiling a library, a program or the Linux kernel itself, the make command by default uses a single logical processor (thread) of the system.
To speed things up we can use the -j parameter and indicate how many logical processors we want to use.

Example:

make -j 8

How do I know how many logical processors (threads) I have available?

cat /proc/cpuinfo | grep processor | wc -l
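
Alternatively, the nproc command prints that same count directly, so both steps can be combined in a single line:

make -j$(nproc)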

In the following video I’ll show you how to compile the kernel on a Dell server with two Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz processors.

Remote access via reverse SSH tunnel when there is no access

Sebastián Misuraca

How many times have you wanted SSH access to a server inside a company’s private network, and couldn’t get the network admin to set up a NAT to the SSH port (22) of the server you needed?

Sometimes, due to security policies or configuration issues, they just won’t forward a public port to port 22 of the server we want to reach.

There is a solution; don’t be afraid of falling into the hands of an unlicensed TeamViewer.

If Muhammad won’t go to the mountain, the mountain will come to Muhammad.
There is a way to do a reverse SSH (an SSH tunnel).

How?

You only need to access the server once (on installation day, or ask someone to run a single line), have the SSH server running on the PC we are trying to connect from, and know that PC’s public IP.
Knowing this, only one SSH command needs to be run on the server:

ssh -N -f -R {destination_port}:localhost:22 {our_public_ip}

E.g.: ssh -N -f -R 22022:localhost:22 200.142.168.151

This command connects via SSH to our PC (it asks for a login on our PC) and leaves a tunnel created, associated with our PC’s localhost (-R sets up the remote port forward, -N runs no remote command and -f sends ssh to the background), so that if we later want to connect to that server we can do it with this simple command:

E.g.: ssh -p 22022 root@localhost

This way we have a permanent tunnel to our PC and we can connect without needing any port 22 to be forwarded from the router.

* The server must have its Internet access correctly configured.
* The SSH server must be running on the public IP from which we go out to the Internet.
* That remote command could be set up at boot, using an SSH key, so the connection re-establishes itself if the server restarts (a sketch follows below).
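
A minimal sketch of that last point, assuming autossh is installed on the remote server and reusing the port and public IP from the example above (the key path and the user name are hypothetical):

# On the remote server: create a key and authorize it on our PC
ssh-keygen -t ed25519 -f ~/.ssh/id_tunnel -N ""
ssh-copy-id -i ~/.ssh/id_tunnel.pub user@200.142.168.151

# Keep the reverse tunnel up across reboots (e.g. from a crontab @reboot entry)
autossh -M 0 -N -f -R 22022:localhost:22 -i ~/.ssh/id_tunnel user@200.142.168.151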

When clipping a video, mind the GOP and hope the I-frame is an IDR-frame

Nicolás Dato

A picture is worth a thousand words, but when it is a B-frame it may be worth two hundred words.

One of our products, ViDeus Auditor, lets you clip and join videos, showing a preview before doing the actual clipping. To do that, we have to understand how an encoded video is composed. We usually work with H264.

When encoding, each video picture can become an I-frame, a P-frame or a B-frame.

  • The I-frame is the easy one: all the information for decoding the picture is within the I-frame.
  • The P-frame needs previously decoded pictures to be decoded, so it uses information from past pictures.
  • The B-frame needs decoded pictures from both the past and the future, so it uses information from past and future pictures.

For example, you can get something like this:
I B B P B B P B B P

The I-frame can be decoded immediately; then the second frame (a B-frame) needs information from previous frames (the I-frame, for example) and from future frames (like the P-frame).

Since B-frames may need information from the following frames, the stream is rearranged for decoding, so that when a B-frame is being decoded, everything it needs is already there. So the frames are usually transmitted like this:

I P B B P B B P B B

This results in the decoding time-stamp (DTS) being lower than the presentation time-stamp (PTS) for the rearranged frames.

That series of frames can be a GOP, a Group of Pictures. A video is composed of a series of GOPs, each GOP starting with an I-frame; this would be three GOPs:

I B B P B B P B B P I B B P B B P B B P I B B P B B P B B P

Since the I-frame doesn’t need any other information to be decoded, it is a good point for fast-clipping a video, because all the information needed for decoding is within it; clipping a video in the middle of a GOP (not at an I-frame) will most likely produce corrupt output for a while, until a new full GOP is decoded.

IDR-frame

But, clipping a video at a GOP start will not always result in a clean output.

The I-frame at the beginning will certainly be decoded fine; it doesn’t need anything special. However, the following B-frames and P-frames will probably need previous frames to be decoded correctly. Sometimes those needed frames are within the GOP, which is useful, but sometimes they are outside the GOP, which is bad for clipping: it means they reference pictures that come before the I-frame where we cut the video, resulting in a corrupt output.

When frames from one GOP reference frames from another GOP, it is called an Open GOP. If not, it is called a Closed GOP.

Hopefully, the video was encoded with IDR-frames. Those are a special case of I-frames: apart from being an I-frame, an IDR-frame ensures that the following frames will not reference any frame before the IDR.
In a GOP the IDR-frame takes the place of the I-frame; all IDR-frames are I-frames, but not all I-frames are IDR-frames.

So, if an IDR is found, that’s a good place for clipping, because that frame can be decoded without any other information and all the following frames will not require information from before the IDR-frame.
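
A quick way to check what a given stream looks like, assuming ffprobe is available (input.mp4 is just a hypothetical file name): it prints the picture type of each frame, and key_frame marks the frames ffmpeg treats as safe entry points (for H264, typically the IDR-frames).

$ ffprobe -v error -select_streams v:0 -show_frames -show_entries frame=key_frame,pict_type -of csv input.mp4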

Next time you want to clip a video, mind the GOP and find an IDR-frame.

Using GCC Intrinsics (MMX, SSEx, AVX) to look for max value in array

Nicolás Sugino

To begin with, you shouldn’t start new code focusing on performance; functionality should be the key factor, while leaving room for future improvements. But well… after you have done your job and everything is working as it should, you might need to tweak your code a little bit to increase its performance.

First functionality, then efficiency.
The problem

We want to find the maximum value in an array composed of int16_t mono audio samples. The maximum value of the array will be the peak value during the audio interval being analysed; this peak is known as the sample peak and should not be interpreted as the real peak of the audio, which is the true peak (there is a very good explanation of the differences here).

A basic(?) solution

OK, we have to look for the maximum value in an array while focusing on functionality; this is actually quite simple…

int16_t max = buff[0];
for (i = 1; i < size; i++) {
    if (max < buff[i]) {
        max = buff[i];
    }
}

 The intrinsics

These are a series of functions which implement many MMX, SSE and AVX instructions; they map directly to C functions and are further optimized by gcc. Most of the instructions use vector operations, and you can work with 128, 256 or 512 bit vectors depending on the architecture and the compiler. There is a very detailed guide here and you can see a full list of the functions here.

You will need to include the headers for the functions you want to call, or just include x86intrin.h, which will include all the available ones. Then you will need to add the appropriate flag to the gcc compile line; in this specific case I’m using -mavx2. If you want to check the supported instruction sets, you can use the following command to list the corresponding macros that gcc will define.

$ gcc -mavx2 -dM -E - < /dev/null | egrep "SSE|AVX"
#define __AVX__ 1
#define __AVX2__ 1
#define __SSE__ 1
#define __SSE2__ 1
#define __SSE2_MATH__ 1
#define __SSE3__ 1
#define __SSE4_1__ 1
#define __SSE4_2__ 1
#define __SSE_MATH__ 1
#define __SSSE3__ 1

The code

What I’m going to do is compare two 128-bit buffers: one of them holds the max value (initialized to 0, so if all the values are negative the result will be wrong) and the other will be the input buffer. This is done using _mm_max_epi16(), which compares 8 values (int16_t) at a time. This is one of the reasons why the intrinsics increase the performance.

After going through the whole input buffer, I will have the maximum value in the maxval vector, but I won’t know its position. For the sake of the example I am doing two redundant things here. First, using _mm_shufflelo_epi16()/_mm_shufflehi_epi16() and _mm_max_epi16(), I compare values inside the vector and rotate them so the whole vector holds the maximum value; since these are 16-bit values I can’t shuffle the whole vector at once, so the shuffle is done on the high and low halves separately.

Finally I store the final vector in an int16_t array with _mm_store_si128() and look for the maximum inside it (I could have done this earlier, but I wanted to show the shuffle, which might be useful if the samples were not 16 bit and the shuffles were not partial).

int16_t find_max(int16_t* buff, int size)
{
    int16_t maxmax[8] __attribute__((aligned(16)));
    int i;
    int16_t max = buff[0];

    /* Each __m128i holds 8 int16_t samples; assumes buff is 16-byte aligned
       and size is a multiple of 8 (any tail samples would be ignored). */
    __m128i *f8 = (__m128i*)buff;
    __m128i maxval = _mm_setzero_si128();
    __m128i maxval2 = _mm_setzero_si128();
    for (i = 0; i < size / 8; i++) {
        maxval = _mm_max_epi16(maxval, f8[i]);
    }
    maxval2 = maxval;
    /* Redundant, for demonstration only: shuffle and compare inside the vector. */
    for (i = 0; i < 3; i++) {
        maxval = _mm_max_epi16(maxval, _mm_shufflehi_epi16(maxval, 0x3));
        _mm_store_si128((__m128i*)maxmax, maxval);
        maxval2 = _mm_max_epi16(maxval2, _mm_shufflelo_epi16(maxval2, 0x3));
        _mm_store_si128((__m128i*)maxmax, maxval2);
    }
    _mm_store_si128((__m128i*)maxmax, maxval);
    for (i = 0; i < 8; i++)
        if (max < maxmax[i])
            max = maxmax[i];
    return max;
}
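
To build it, include x86intrin.h in the source and pass the AVX2 flag mentioned above (the file names here are just hypothetical):

$ gcc -mavx2 -O2 -o find_max find_max.c
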
Some numbers

I’m going to compare 3 different cases, finding the maximum value in (pseudo) random arrays of 10000 and 1000000 samples.

  • Using an intuitive for loop like the one shown before
  • Same loop as before but with compiler optimizations
  • Using SSE instructions via intrinsics (sample code).

This table shows the average delay, in microseconds (µs), that the different functions take.

Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
Elements    Mode                Avg. Delay [µs]
10000       Default             650
10000       -O2                 250
10000       Intrinsics          200
10000       Intrinsics w/ -O2   30
1000000     Default             31000
1000000     -O2                 13000
1000000     Intrinsics          8500
1000000     Intrinsics w/ -O2   3500

As you can see, the code gets much faster with the Intrinsics, and even faster with the optimizations.

Afterword
  • You will see an improvement in most cases, and it gets better as the arrays get larger
  • It gets even better with optimizations (-O2)
  • There is still a difference even with smaller arrays (the 10000-element ones)
  • The key factor is to look for repetitive operations over large arrays
  • You may lose some portability, as some functions may not be available on every processor
  • There are some transition penalties when switching between AVX and SSE, so this should be considered when mixing both

Hope you guys liked the post; please feel free to ask any questions and, if I can, I will answer them.

Debugging a SIGSEGV Backtrace

Joaquin Fernandez

Before beginning my post I need to introduce the term backtrace: a backtrace is the series of the most recent function calls in your program (see man backtrace). With a backtrace you can access the call stack, in other words, how you got to that point in your program.

Today, while working on some code, I got a SIGSEGV and, obviously, the subsequent crash. After checking the log I found this backtrace (which was generated using backtrace()):

[17101 XX 12:17:05 (+6)][23417] {sigsafe} src/common.c@1233: SIGSEGV(11), puntero 0xc0 desde 0x7f288f7c79aa
[17101 XX 12:17:05 (+6)][23417] {sigsafe} src/common.c@1252: [bt]: (0) /usr/lib64/twsmedia/libtwsmedia.so(twsmedia_widget_alarm_pool_draw+0x1680)[0x7f288f7c79aa]
[17101 XX 12:17:05 (+6)][23417] {sigsafe} src/common.c@1252: [bt]: (1) /usr/lib64/twsmedia/libtwsmedia.so(twsmedia_widget_alarm_pool_draw+0x1680)[0x7f288f7c79aa]
[17101 XX 12:17:05 (+6)][23417] {sigsafe} src/common.c@1252: [bt]: (2) /usr/sbin/mwconstructor[0x40a2c0]
[17101 XX 12:17:05 (+6)][23417] {sigsafe} src/common.c@1252: [bt]: (3) /lib64/libpthread.so.0(+0x7df5)[0x7f289173edf5]
[17101 XX 12:17:05 (+6)][23417] {sigsafe} src/common.c@1252: [bt]: (4) /lib64/libc.so.6(clone+0x6d)[0x7f288834c1ad]

Looking at this log you know where the problem was… but which line is it?

The simplest way to debug (if you don’t know this trick) is to run gdb and try to reproduce the bug, but that is not always a good option.

But wait! What do you do then?

We will search for the problematic line directly inside the .o file of the library… Let’s begin:

Use nm to find the function’s start address in the .o file:

nm src/twsmedia_widget.o | less

You will find something like this line

000000000000cc9a T twsmedia_widget_alarm_pool_draw

0xcc9a is the start address of the twsmedia_widget_alarm_pool_draw function.

Then, add the 0x1680 offset (from twsmedia_widget_alarm_pool_draw+0x1680) to that address, which in this case gives 0xe31a.
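
If you don’t want to do the hex addition by hand, the shell can do it for you:

$ printf '0x%x\n' $((0xcc9a + 0x1680))
0xe31a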

Finally, call addr2line to look up the specific line in the .text section of the object:

$ addr2line -j .text -e src/twsmedia_widget.o 0x000000000000e31a 
src/twsmedia_widget.c:3139

And problem solved! Now we have the problematic line: src/twsmedia_widget.c:3139

It’s important to highlight that this method won’t work in some scenarios, such as static functions (because you won’t have the function names in the backtrace), and that the object needs to be built with debug info (-g) for addr2line to resolve source lines.

That’s all for now, see you later!

Codec vs Format: Part 1

Differences between codec and format.

Sebastián Misuraca

Commonly, when we talk about video formats, the concept of codec gets confused with the concept of format.

The difference is simple: the codec refers to the type of algorithm used to compress the video (or audio, or subtitles), while the word format usually refers to the combination of the transport (or container) used to store the audio and video plus the codecs used to compress them.

Format = Transport + codecs

Example: XDCAM is a format that uses the MXF transport, the MPEG2Video video codec and PCM audio.

There are no conventions for every possible combination, so in general the format refers to the transport and usually shows up in the file extension.

For example, MP4. When people talk about the MP4 format, it is usually assumed that the transport is MP4, the video codec is H264 and the audio codec is AAC, but the codecs could be different.

Examples of transports:
MP4, AVI, MOV, MKV, MPEG-TS, OGG, WMV

Examples of codecs:
H264, MPEG4, WMV, MPEG2-VIDEO, AAC, AC3

Not every transport can contain every codec, and it may happen that a player recognizes the transport but not some of the codecs.

Think of the transport literally as a means of transporting cargo (plane, train, bus) and of the codec as the kinds of cargo that can go inside. They are generally independent of each other, with a few exceptions where certain transports are exclusive to certain codecs, such as FLV.

AVI and MP4 are transports.

In my view, AVI could be an old, limited and unsafe transport, like this train:

MP4 could be something like this:

Is WMV a codec or a format?

WMV is the name of a format but it is also the name of a codec, both created by Microsoft. As a format, it can only contain WMV video and WMA audio. The WMV video codec, in turn, can also be contained inside an AVI format.

When someone says they have an MP4-format file, you don’t actually know which video codec is inside; there are a number of supported codecs, such as MPEG2VIDEO and H264, so in reality they are only referring to the transport.
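
If you want to see this distinction on a real file, a tool like ffprobe (assuming it is installed; my-file.mp4 is just a hypothetical name) prints the container on the input/format line and the codec of each stream on the stream lines:

$ ffprobe -hide_banner my-file.mp4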

 

“It is very important to know, then, that when we talk about format we are generally talking about the transport, and that inside it there are videos and/or audios compressed with some codec.”