Latest Publications

Multimodality in the Trecvid Evaluations

This month NIST (National Institute of Standards and Technology) is organizing a set of evaluations called Trecvid to test several technologies related to video processing. The Trecvid evaluations (http://www-nlpir.nist.gov/projects/trecvid/) are long-lived yearly events that grew out of the TREC (Text REtrieval Conference) evaluations, which started in the 1990s and focused on text processing for information retrieval tasks. In 2001 a video component was introduced, and in 2003 it became its own evaluation, which has since become very prominent among researchers in the video and image fields.

The Trecvid evaluation proposes a set of tasks for the participants to carry out using a common benchmark database. Participants run their systems on these databases and return a set of answers for each task to the NIST organizers, who evaluate the answers and later release the results showing how everyone did. Results and system descriptions are then presented at a workshop that brings together all the researchers who participated. This is a perfect place to compare different technologies and new ideas on topics of interest to both industry and academia. Trecvid (like many other NIST evaluations) does not aim to be a competition, but a framework to evaluate technology in a fair way, under the same conditions.

Personally, I had participated in the past in several (3) NIST evaluations called RT (Rich Transcription evaluations). In these, the data was composed of audio recordings from radio/TV broadcasts or meeting room recordings, and the task was 1) to transcribe what was said in the recordings, and 2) to find how many people were speaking and where in the recordings each of them was speaking. The RT evaluations are very well known within the speech community and have been running for a few years.

With my recent broadening of interests towards multimodal processing I became aware of the Trecvid evaluations and, in particular, of the video copy detection task, whose objective is to find video copies in a video database. This year we have been given around 400 hours of reference videos (of many lengths and sources, some of them in black and white and some in languages other than English). To test the systems we have been given a set of queries composed of shorter videos, each of which may or may not contain a segment (the whole query can be the segment, or just a piece of it) that is a transformation of a segment existing in the reference material. Many transformations are possible, with different degrees of degradation in the audio and video parts. The evaluation has 3 different deadlines: the first one (just passed) is the video-only submission, where the queries are composed only of the video part, with no audio; the second is the audio-only submission (August 28th); and the third is the audio+video submission (October 1st). This year is the first in which the audio+video analysis is a mandatory submission for all teams.
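To make the task a bit more concrete, here is a toy sketch in Python. It is only an illustration under strong assumptions and not the method of any Trecvid participant: each video is reduced to a list of made-up per-frame integer "fingerprints", and a copy is declared when the query shares a long enough run of identical consecutive fingerprints with a reference video. The function names, the `min_len` threshold and the data are all hypothetical, and exact matching like this would not survive the transformations used in the real evaluation.

```python
from typing import Dict, List, Optional, Tuple


def longest_common_run(query: List[int], ref: List[int]) -> Tuple[int, int, int]:
    """Return (length, query_start, ref_start) of the longest run of
    consecutive fingerprints that appears in both sequences."""
    best = (0, 0, 0)
    prev = [0] * (len(ref) + 1)
    for i in range(1, len(query) + 1):
        curr = [0] * (len(ref) + 1)
        for j in range(1, len(ref) + 1):
            if query[i - 1] == ref[j - 1]:
                # Extend the run that ended at the previous frame pair.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best[0]:
                    best = (curr[j], i - curr[j], j - curr[j])
        prev = curr
    return best


def detect_copy(
    query: List[int],
    references: Dict[str, List[int]],
    min_len: int = 3,  # hypothetical threshold, chosen for the toy data
) -> Optional[Tuple[str, int, int, int]]:
    """Return (reference_name, length, query_start, ref_start) for the best
    match of at least min_len frames, or None if the query has no copy."""
    best_match = None
    for name, ref in references.items():
        length, q_start, r_start = longest_common_run(query, ref)
        if length >= min_len and (best_match is None or length > best_match[1]):
            best_match = (name, length, q_start, r_start)
    return best_match


# Made-up fingerprints: the query hides frames 3..6 of "news" between other frames.
references = {"news": [1, 2, 3, 4, 5, 6, 7, 8], "film": [9, 9, 8, 7, 6, 5]}
query = [0, 0, 4, 5, 6, 7, 0]
print(detect_copy(query, references))
```

A real system would have to replace the exact equality test with a similarity measure robust to re-encoding, cropping, added noise and so on, and would need an index to cope with hundreds of hours of reference material.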

I find it very interesting that Trecvid and NIST are trying to promote research in the audio modality within this evaluation from this year on, and I hope this will lead to future editions where audio is on the same level as video, with the audio-only submission also mandatory for participating labs. I agree that this is not an easy task, as many (or most) of the participating teams are composed of video-only researchers. I think, though (and I speak from some experience), that moving into the opposite modality (from audio into video for me, and from video into audio for most Trecvid participants) is a very enriching activity. Many new ideas come from applying well-established techniques to the new modality, ideas that have never been explored because the audio and video groups usually never mix, and are sometimes even in different physical locations.

I see this much-needed fusion and mutual understanding as similar to what I went through while finishing my EE studies. I knew at that time that I wanted to pursue a career as a speech engineer, so I looked for the opportunity to attend linguistics classes at universities in my city, in order to get in touch with some of the people and knowledge that I would have to work with later on when building speech recognition systems or text-to-speech applications. This was very enriching personally and professionally. As one professor used to say, I tried to “bridge the gap” between linguists and engineers.

With the audio and video communities I think we should try to do the same, and the Trecvid evals could be one point where both areas get together and discuss common problems. We can all benefit a lot from multimodality, and the technology will definitely also improve dramatically when we look at the problem from orthogonal perspectives. To do so, we need many more activities where audio and video come together under the same umbrella, but we also need some help getting people from the two fields interested in each other. It is not enough to put together a European project where each group does its own thing and nobody talks to each other. We need real collaboration where algorithms and ideas flow both ways. One good way to start would be to create real multimodal databases with high-quality annotations for both the audio and the video parts.

I am very happy working in a multimodal area and I am very glad I found Trecvid and the video copy detection task, the perfect place to exercise my ideas.

Christmas is here!

Dear readers,

Many months have passed since my last post. It was right in the middle of my trip to India, from which I have now fully recovered. Some day I will write a little more about my final thoughts regarding that interesting country.

Today I am enjoying my first day of vacation from work and preparing for the Christmas days by arranging my house and thinking about the presents I need to get. I am also sending a few Christmas postcards by email with a video attachment. I just learned about this new way of sending Christmas greetings from a coworker, and I must admit that it is a great idea now that almost everybody has access to broadband internet and appreciates novel ways of greeting the holidays, beyond a note, a picture or a picture with music.

I am also slowly working on my new website cover and content. For now this blog appears as the cover, but as soon as I can finish it you will have access to my pictures, my publications, my PhD thesis online and, of course, this blog. So stay tuned.

Well, let’s keep things going. Merry Christmas and happy new year 2009!

Shopping pressure in India

Before I got to India, the guides had warned me to be careful with people trying to steal money from me or trick me just because I was a tourist. I arrived a little afraid of that, and I have found a totally different story that I want to share with you.

The pressure here from people trying to sell you a service or a good is enormous. I had never experienced people walking next to me for 5 minutes trying to sell me something, or trying to get me to sit in their rickshaw so they could take me some place. I also have to admit that at no time have I felt in danger of getting my wallet stolen, or even felt that they were looking for it (I carry it in a secure place, nonetheless).

But let’s focus on the shopping pressure. Imagine you are walking down Las Ramblas and every gift shop has 2-5 people sitting outside who jump at you, inviting you to enter their shop because they have a very nice “something” or a very cheap “whatever”. They sometimes list all the things that they can sell you (which must be ranked according to their top sellers), and if they see your face change when they mention any of them, they start dropping the price to try to attract your attention. Furthermore, if they see you looking at anything, they quickly bring it to you and start the pricing game.

My friend has the theory that the initial price can be bargained down to about 30% of its value. Getting lower than that is difficult and will take you more time, but we have done it :) I have to say that shopping in India has become like a sport for us, in which you end up with tons of things that probably won’t fit in my apartment (even if it’s new, read my previous posts) and won’t fit in the luggage.

Finally, I cannot conclude without telling you about commissions. Whenever I have asked someone for information, they have given me a self-interested answer, either directing me to a place where they would get a commission or telling me that where/what I wanted was not possible and that they had another option, which was very cheap, of course. If you go shopping and can keep these people from taking commissions, you’ll be in a better position to bargain for a good price (which will not include their commission).

A grandes rasgos

10 days of this journey through the north of India have already gone by, and only now do I have a “relaxing” evening in Agra (home of the Taj Mahal) to write an overall impression of the trip so far.

I will possibly talk about many of the particulars in other posts, but this is meant to be an “overall impressions” entry, or “a grandes rasgos”.

When we got to India the welcome was a bit harsh. We landed on July 14th at 11PM. Given that July 15th was a national holiday and that the president had to give a talk at Delhi’s Red Fort, we got a bit stuck in the airport, with people telling us that they would not be able to take us to the hotel (which was within the restricted area) and that the hotel had shut down for the night.

We got serious, and after two and a half hours in a taxi we got to what seemed to be a hotel. The hotel turned out to be very nice inside, even though we had to get in through some small doors while avoiding people who were sleeping in the street.

The next day we met 2 girls from Mataro and their mother, and we embarked together on an 8-day trip through Rajasthan by taxi. For this we had the company of 2 drivers who did not speak much English at all (contrary to what the tourist office had told us).

We have just finished our tour and landed in Agra. It has by far the hottest climate so far. Today my companion and I suffered a bit from heat stroke as we walked through the Taj Mahal area.

Tomorrow we are heading towards Delhi, but just for a pit stop on our way to Rishikesh, town of yoga and ayurvedic massages. I’ll tell you more about that later.

De paseo por… India

Hi friends,

This is about to start… in two days I’ll be heading off to India for 15 days of vacation. I have been looking forward to it for a couple of months now, and although one of the three musketeers unfortunately cannot make it, we are still in a good mood and looking forward to Ankara, Delhi, and a culture shock.

In fact, in order for the trip to be just a culture shock, and not an illness shock, I have finally made my way down to the vaccination center in Barcelona. I am not a good friend of needles and I had been putting off the trip to the doctor (even considering not going at all), until today, when I was talking to a friend at work who convinced me that this was not a good idea.

I am happy to have gone and, in fact, I left with 3 shots and no pain at all (except for some pain in my wallet). Apart from some time waiting, I got excellent attention from very professional people who answered all my questions and offered me more information than I had asked for about what medicines to buy, and what to do and not to do over there.

So I would just say: listen to your friends all the time, but leave health issues to the professionals. They are the only ones who can really tell you what is best for you.

My new flat

Dear reader,

It’s been a while (over 6 months) from my first post until this one, and I felt that enough was enough, that I need to focus and write a bit more. One of the reasons why I have not been very prolific in writing is that I just got my new flat, which took some time to find and has taken (less) time to settle into.

Now I can say that I am an owner! Of a mortgage, at least.

I will leave the description and some pictures of the flat for another time. Now I would like to talk about the “moving in” issue. I call it an issue as it is not an easy thing, especially when the flat is not a new one (like mine) and you have a bunch of things to move from an old flat to this one.

In my case the actual “moving” was pretty fast. I had to be out of my old 30m2 rented apartment by July 31st, and I only got to sign the mortgage and get the keys on the afternoon of the 30th. My girlfriend and I started carrying boxes that evening and we finished the big stuff with my dad on the morning of the 31st. We can definitely say that everything was finished by the afternoon of the 31st, as I needed to leave for Valencia, where I had to be on the 1st :) So we carried a whole bunch of boxes into the living room in less than 24 hours, with 3 people and lots and lots of sweat… and a parking ticket.

Getting all these boxes into their places is not as easy as I thought. First you need to make sure that the place where you’re going to put something is clean before you do so, which sometimes turned out not to be the case, so some time is spent trying to get that stain or dust out of there. I am allergic to dust, which makes it even funnier, as I could not stop sneezing during all the days I have been putting things in order…

Once the “container” is clean, it is time to put the content in there. This is not an easy job either, as the new and old flats are not the same (one is 3.5x the size of the other). There is normally a lot of thinking about where I want my things stored, which is normally debated item by item with the loved ones (in my case it was not so bad).

Finally, today, I can say that there are no more full boxes in my living room (there are some empty ones, waiting for a trip to the recycling container downstairs). The tough part has finished, but now another part starts: the rest of the cleaning, and the improvements. I have many ideas about things that I can put in the flat to make it more livable, and there is still a lot to clean (including the living room floor), which will certainly be addressed next.

One thing is true: it makes a big difference to clean and put things in order in an apartment that is yours rather than one that is just rented. I have spent the last 10 years renting places, and it feels more important now.

Overall, I love my new flat!  

This is cool

Hi reader!

I would never have promised anyone that I would be starting a blog this year. Indeed, this was not one of my New Year’s resolutions, but here I am :)

Lately I have discovered how important it is to get our individual word out. As insignificant as it might seem, it can help someone, and it can also help me. In fact, by writing things down, or saying them out loud, I get them clearer in my mind.

This blog will not be about technical stuff only. I am an EE/Telecom engineer and I work on speech processing, but not everything in life is work, or is it?

The blog will not be about politics either. I am not very political myself, so although I might sometimes refer to something that caught my attention, I will only take the side that aligns with what I think, and not with what any political party says.

Also, the blog is not intended to be about traveling (because I am mostly stationary right now), or food (I like to eat it, not to talk about it), or culture, or the economy, or…

So, what can I talk about? A little bit of everything and a lot of nothing. This is why I titled it “Estoy de paseo”, which means that I am taking a walk around many topics, whichever ones catch my attention and I feel like writing a bit about.

Ah, and who am I? I am Xavi, living in Barcelona (Spain) for now, working for a big telco and currently single.