Monday, October 26, 2009

Going Parallel in R

I recently had the need to get some parallelism going in R. I was doing some largeish-scale Monte Carlo and Markov chain Monte Carlo simulations of latent space network models for NIPS. My needs were quite simple: I just needed lots of repetitions, and I wanted to spread them easily across a few processors/cores, all on a single machine.

I found a number of options and started out using the multicore package. This worked wonderfully and easily when I tested it out on my desktop and ported my code over to use it. Of course, I neglected to notice the not-so-small print in the manual:
SystemRequirements: POSIX-compliant OS (essentially anything but Windows)
Now, I had never really encountered system-specific packages for R before, so this kind of caught me by surprise when I scp'ed things over to our big ole' Windows machine to run and found that I couldn't install multicore. Well, I eventually wised up and replaced multicore with snowfall when on a Windows system. I'm beginning to think I should have just used foreach, but that's another story.

Editing Mendeley Citation keys

Mendeley is a nifty new tool for organizing and sharing academic references. I've been test driving it a bit, mainly using it to import files that I then transfer over to JabRef. This was working out alright, since Mendeley allows you to export to bibtex, which is the backend for JabRef and what I ultimately end up using to write my papers. I've been looking for a way to streamline this process, since I like Mendeley's interface a lot better and it has better search features, e.g. full-text search of all attached pdfs.

Unfortunately, Mendeley doesn't use bibtex for its backend, but it does use SQLite and will even let you export your database as a SQLite zip file. I can't seem to figure out where Mendeley stores my user files so far, so exporting seems the way to go.

Using the SQLite Database Browser, I managed to load up my Mendeley database. Things seem pretty obvious, with most of the things you'd want to futz with in the "Documents" table. For now I just removed all the citationKey values and generated my own the way that I wanted them, but it's nice to know that it's not too miserable to get at the raw data from Mendeley.
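The same edit can be scripted rather than done by hand in a database browser. What follows is a minimal Python sketch, not a description of Mendeley's actual schema: only the "Documents" table and its citationKey column come from the export described above. The id, surname, year and title columns, the key format, and the helper names are all assumptions made up for illustration, and the demo runs against a throwaway in-memory database instead of a real export.

```python
import re
import sqlite3

# Words skipped when picking the title word for a key (an arbitrary choice).
STOP_WORDS = {"a", "an", "the", "on", "of", "in", "for"}


def make_key(surname, year, title):
    """Build a key such as 'smith2009parallel' (hypothetical key format)."""
    name = re.sub(r"[^a-z]", "", surname.lower())
    words = re.findall(r"[a-z]+", title.lower())
    first = next((w for w in words if w not in STOP_WORDS), "")
    return f"{name}{year}{first}"


def rewrite_citation_keys(conn):
    """Overwrite citationKey for every row of Documents; return the row count.

    Assumes Documents has id, surname, year and title columns, which is an
    illustrative schema and not necessarily what Mendeley exports.
    """
    rows = conn.execute(
        "SELECT id, surname, year, title FROM Documents"
    ).fetchall()
    for doc_id, surname, year, title in rows:
        conn.execute(
            "UPDATE Documents SET citationKey = ? WHERE id = ?",
            (make_key(surname, year, title), doc_id),
        )
    conn.commit()
    return len(rows)


def demo():
    """Run the rewrite on an in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Documents (id INTEGER PRIMARY KEY, citationKey TEXT, "
        "surname TEXT, year INTEGER, title TEXT)"
    )
    conn.executemany(
        "INSERT INTO Documents (citationKey, surname, year, title) "
        "VALUES (?, ?, ?, ?)",
        [
            ("old1", "Smith", 2009, "The Parallel Simulation of Networks"),
            ("old2", "O'Neil", 2008, "A Study of Maps"),
        ],
    )
    rewrite_citation_keys(conn)
    keys = [r[0] for r in conn.execute(
        "SELECT citationKey FROM Documents ORDER BY id")]
    conn.close()
    return keys
```

To try this on a real export, one would point sqlite3.connect at a copy of the exported file and first check the actual column names, since the ones used here are guesses.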

Note: You can also restore an edited zip of sqlite data, but you have to have it all flat, exactly the way that Mendeley exported it.

Tuesday, March 24, 2009

Analysing Spatially Embedded Networks

I realized this would probably be a decent place to describe some of the work I've been doing and hopefully get some feedback. I've been working at the intersection of spatial information and network information, trying to develop ways of integrating the two types of information. This is not the "spatial networks" that are used to analyze space syntax and transportation networks. Instead, I'm interested in looking at arbitrary networks of things where some of the things are labeled with spatial location information.

So far I've analyzed drug seizure networks, shipping networks, organizational networks of terrorist groups, epidemiological networks and simulated social networks. Most of this analysis has been to test new methodologies I've developed. There are two main techniques I've been working on: manipulating network scale and aggregation, and visualizing spatial dependencies in network topology.

There is a fundamental discord between network and space. Space is continuous; relationships in networks are defined as between discrete entities. This means that some level of aggregation of space (implicit or explicit) is required in order to do a meaningful analysis of the network. Different levels of aggregation can lead to quite different networks. I've been working on methods of capturing the tradeoffs of aggregation versus precision.
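To make the effect of aggregation concrete, here is a minimal Python sketch. It is not the method implemented in ORA; it assumes the simplest possible scheme, a square grid, and every name in it (cell_of, aggregate_network, the toy coordinates) is invented for illustration. Nodes with a known location are binned into grid cells, and ties between nodes become weighted ties between cells.

```python
from math import floor


def cell_of(point, cell_size):
    """Map an (x, y) location to the grid cell that contains it."""
    x, y = point
    return (floor(x / cell_size), floor(y / cell_size))


def aggregate_network(locations, edges, cell_size):
    """Collapse located nodes into grid cells.

    Returns a dict mapping an unordered pair of cells (stored as a sorted
    tuple) to the number of original ties between them. Ties that touch a
    node without a location are skipped, and ties that fall inside a single
    cell vanish from the aggregated network.
    """
    weights = {}
    for u, v in edges:
        if u not in locations or v not in locations:
            continue
        cu = cell_of(locations[u], cell_size)
        cv = cell_of(locations[v], cell_size)
        if cu == cv:
            continue
        key = tuple(sorted((cu, cv)))
        weights[key] = weights.get(key, 0) + 1
    return weights


# Toy data: four located nodes and four ties.
LOCATIONS = {"a": (0.5, 0.5), "b": (1.5, 0.5), "c": (3.5, 0.5), "d": (3.6, 2.5)}
EDGES = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]

fine = aggregate_network(LOCATIONS, EDGES, cell_size=1)    # 4 ties, weight 1 each
medium = aggregate_network(LOCATIONS, EDGES, cell_size=2)  # 2 ties, one of weight 2
coarse = aggregate_network(LOCATIONS, EDGES, cell_size=4)  # no ties left
```

With a cell size of 1 every node sits in its own cell and all four ties survive; at 2 the tie between a and b disappears inside one cell while two other ties merge; at 4 the whole network collapses into a single cell. That is the precision lost in exchange for a simpler picture.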

Second, visualization of social networks has been important historically in providing the intuition for many commonly used network statistics. My hope is that the visualization of spatially embedded network data will be similarly useful. However, simple visualizations of spatially embedded networks quickly become noisy and difficult to interpret. For this reason, I'm developing techniques for visualizing higher-level network topological properties and their interaction with spatial location. I am also working to develop a statistical measure of the spatial dependencies in structural properties of a spatially embedded network.

All of these techniques and more have been implemented in the Geospatial Network Visualizer in the ORA dynamic network analysis tool.

Sunday, March 1, 2009

Managing Bibtex and PDFs with JabRef!

A few days ago, I decided it was time for my assorted bibtex files to grow up and get organized. I also decided that my assorted pdfs of papers should get centralized and organized. I had the crazy idea that there might be a tool that could do both of these tasks in an open, portable format that I could easily transition away from if I grew tired of the tool. Unlike past endeavors, in this task I succeeded beyond all expectation. JabRef is an open source reference manager that uses the standard bibtex format as a backend. Unlike other tools I tried, it works, and it is simple and pretty feature-complete. It provides a centralized location to store references, notes/reviews/summaries of said references, as well as links to the actual pdfs (when you have them). JabRef will automatically generate bibtex keys for you, if you ask it, and if you save your pdfs with the respective key in the filename, JabRef can automatically find the file and associate it with the intended reference! Plus, like many of the other tools, it can automatically sync up to many of the online databases and download documents if you have access to them.

It also has extremely flexible keyword/category-based organization of references. Your categories can inherit either supercategories, subcategories, or neither.

Cisco VPN client in Ubuntu 8.10

I'm not sure what changed, but I can't seem to get the Cisco VPN client working in Ubuntu 8.10. In 8.04 I'd gotten things working by following instructions like these. Once again, the good folks at tuxx-home.at had a few solutions, but I couldn't get them to work. I got either seg faults when the client ran or compiler errors that kept it from building at all. Anyway, I'd been leery of trying vpnc since my .pcf files for the SCS at Carnegie Mellon didn't quite look like any of the examples. However, when I eventually tried this decoder with my encoded group password, everything worked out! Now I just use the NetworkManager applet to manage my VPN and it all (mostly) works.

FGLRX Sadness

I was reminded once again of why other people give up on linux when I walked over to my new desktop only to find that the 3d graphics driver, fglrx, had decided not to work. I spent several hours over the next few days trying to track down the issue, uninstalling, reinstalling, reverting to older versions of the driver, xserver, the linux kernel, all to no avail. Eventually, I gave up and am now using the 2d open source drivers. The fglrx driver had even stopped working adequately for simple window management things.

Thursday, January 29, 2009

HIDPoint Rocks

HIDPoint makes a free driver and configuration program for a variety of mice and keyboards for linux. I just installed their drivers for the Logitech MX3200 wireless keyboard and the MX600 wireless mouse and it works fabulously. All the special buttons and fancy things work! I couldn't be happier!

Wednesday, January 28, 2009

Also

I love my G1

Android HTML parsing

A few weeks ago, I spent an extended period of time trying to find an HTML parser that would work reasonably well on the Android mobile operating system. Long story short: fail. This site provides a helpful list of open source HTML parsers for Java, but alas each and every one of them was too slow to be usable on Android. I wasn't even trying to parse a large page, only ~100K, but it was still taking ~1 minute to parse. In the end, I gave up. Instead of parsing the site and redisplaying some of the info, I just manually skipped down to the content part of the page and fed the content as a String to the built-in WebKit browser. Fortunately, Android makes it extremely straightforward to embed the browser in an app.
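The "skip down to the content" trick is plain string slicing and needs no parser. Here is a minimal sketch of the idea in Python rather than the Java an Android app would use. The function name and the comment-style markers are hypothetical; a real page would need whatever unique strings happen to surround its content.

```python
def extract_content(page, start_marker, end_marker):
    """Return the slice of page from start_marker through end_marker.

    Returns None if either marker is missing. No HTML parsing happens here:
    this is a plain substring search, so it only works when the markers
    are unique and stable on the target page.
    """
    start = page.find(start_marker)
    if start == -1:
        return None
    end = page.find(end_marker, start + len(start_marker))
    if end == -1:
        return None
    return page[start:end + len(end_marker)]


# Made-up page with hypothetical markers around the interesting part.
PAGE = (
    "<html><body><div id='nav'>menu</div>"
    "<!-- content --><p>Hello</p><!-- /content -->"
    "<div>footer</div></body></html>"
)

snippet = extract_content(PAGE, "<!-- content -->", "<!-- /content -->")
```

The resulting snippet is what would then be handed to the embedded browser as a string.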