Francesco BonchiSenior Research ScientistYahoo! Research Barcelona |
| HOME PUBLICATIONS Y!BCN SEMINARS PROFESSIONAL RESUME CONTACT |
Influence Maximization: a data-based approach [October 2011]
While most of the literature on Influence Maximization
has focused mainly on the social graph structure, in our new paper (accepted for VLDB'12 [bibtex], together with Amit Goyal and Laks V. S. Lakshmanan)
we proposed a novel data-based approach, that directly leverages available traces of past propagations.
Our Credit Distribution model estimates influence spread by exploiting historical data, thus avoiding the need for learning influence probabilities, and more importantly, avoiding costly Monte Carlo simulations, the standard way to estimate influence spread. Based on this, we developed an efficient, effective, and scalable algorithm for the Influence Maximization problem. Beyond this main contribution, the paper provides several interesting insights in the Influence Maximization problem. Check it out! MY KEYNOTE TALK AT WI-IAT 2011 [August 2011]Please find below the slides I will use in my keynote talk this morning at WI-IAT 2011 in Lyon. The talk is entitled Influence Propagation in Social Networks: a Data Mining Perspective (pdf). If you want the powerpoint, please e-mail me .SPARSIFICATION OF INFLUENCE NETWORKS [June 2011]SPINE (SParsification of Influence NEtworks) is a new algorithm for discoverying the "backbone" of an influence network. Given a social graph and a log of past propagations, we build an instance of the independent-cascade model that describes the propagations. We aim at reducing the complexity of that model, while preserving most of its accuracy in describing the data. The paper will be presented next August at the KDD 2011 conference. #viscous_democracy #castelleres @CACM [June 2011]Our paper Viscous democracy for social networks,
(together with Paolo Boldi, Carlos Castillo and Sebastiano Vigna) has been published in the virtual extension of the June 2011 issue of Communications of ACM. LINK PREDICTION: A RULE-BASED APPROACH [Apr 2011]
Recently I've seen a nice survey on Link Prediction covering a lot of different approaches, but overlooking our frequent pattern based approach introduced in:
Bringmann et al. Learning and Predicting the Evolution of Social Networks IEEE Intelligent Systems ©IEEE 25(4):26-35 - Jul/Aug 2010. Briefly, in this paper we extend our previous work on Mining Graph Evolution Rules, showing how to use those rules for Link Prediction. Our method is quite different from standard Link Prediction, and leverages the history of the network evolution till now to predict future evolution. Our frequent pattern based approach not only predicts arrival of new edges among pairs of existing nodes, but also allows to predict edges among one existing node and a new node that is not yet part of the graph. NEW BOOK ON PRIVACY AND MINING [Jan 2011]
Check out Elena's interview with the IEEE Computer Society. TAXOMO SOFTWARE RELEASED [Jan 2011]I am glad to announce that today we released the TAXOMO sequence mining software under a BSD license. TAXOMO is a data-mining tool for sequences. It takes as input a set of sequences and a taxonomy, and generates a succinct description of the sequences (specifically, a Markov chain with lumped states). The input sequences may represent any kind of data, e.g.: trajectories on a map, web pages visited by a user, etc. The taxonomy should be defined over the states in the sequences. In the case of a map, for instance, they can be regions and sub-regions for the points in the map. In the case of a web site, they can be categories and sub-categories for the pages. Taxomo was developed at Yahoo! Research Barcelona, and it is described in:Bonchi et al. Taxonomy-driven lumping for sequence mining Data Mining and Knowledge Discovery Journal ©Springer 19(2):227-244 - Oct 2009. For more information and download, see: http://taxomo.sourceforge.net/ ![]() Example of application of TAXOMO to a dataset of trajectories. (a) Road network used to generate the trajectories dataset. (b) Markov model of the whole dataset at the leaves level (i.e., the original grid with 1,024 states). (c) Markov model of the whole dataset with 175 states, found by Taxomo. Arc thickness represents the probability of the transition.. MEME RANK [Dec 2010]More animations and information about Meme Rank code here. ICDM 2010 PANEL [Dec 2010]I was a panelist at ICDM'10. The panel was entitled "Top-10 Data Mining Case Studies". The other panelists were: Ashok Srivasta, Bart Goethals, Rayid Ghani, Christos Faloutsos, Longbing Cao, Osmar Zaiane, Rong Duan, Paul Beinat, and Geoff McLachlan. The panel chairs were Gabor Melli and Xindong Wu.YAHOO! CLUES LAUNCHED [Nov 2010]Yahoo! launched Clues, a service based on science from Yahoo! Research Barcelona. Yahoo! Clues lets you explore how people are using Yahoo! Search. When you enter a word or phrase in the "Search Term" field and click Discover, you’ll see information about that search term’s popularity over time, across demographic groups, and in different locations. You can also enter a second search term in the "Compare With" field. This will show you information on both search terms, side by side.
Among other things, it allow users to browse a real query-flow graph:
|
ME ON TWITTER
RECENT PUBLICATIONS
|
| WEB DESIGN CREDITS |