It’s been more than a month since I left Leeds, but my mind often returns to the days spent in lovely West Yorkshire. I start by thanking everyone at LIDA and CDRC for making my stay so productive and enjoyable.
The visit began at the ECTQG conference in York, which was a great opportunity to first meet with LIDA’s faculty and students and to be introduced to some of their research. The international setting of these first encounters enabled all of us to set our exchanges in a rich, multidisciplinary context that was very valuable. At the conference, I was also pleased to attend the Alan Wilson Symposium. Commemorating more than a half century of ideas and research about urban data and analytics was a useful reminder that a field of research needs good theory and methods to truly benefit from technological advances! I was similarly privileged to hear presentations at LIDA’s Annual Meeting. The meeting was an opportunity to learn about the institution building processes used to garner the massive sets of data that are being generated in our times.
Meetings with LIDA’s researchers, faculty, and students were of course the highlight of my stay. I was impressed with the creativity with which sets of “big” data were being used to find answers to difficult questions. Electronic transit cards, food permits, crime records, supermarket receipts, app-generated data on location and activity, e-shopping, etc., all served as platforms to build knowledge by better understanding behaviors (e.g., travel or shopping patterns) or to provide feedback to improve behaviors (e.g., eating or exercising). The partnerships established between private sector entities, service providers, and researchers looked to me to be a most promising model for research, especially as they led to students doing research in a hybrid environment mixing science and business. This arrangement seemed to provide students with the assurance that they were being trained in research using advanced methods and that their work was also relevant to the world outside of academia. At the same time, the arrangement added internal tensions (and perhaps even conflicts…) as each student worked to balance measures of usefulness and scientific rigor in their research.
I was happily surprised to find students and faculty coming from a variety of disciplines in the social sciences and working with large secondary datasets. At my university, “big” data appear to stay in the hands of computer scientists who do not have much of a background in the social sciences. As might be expected, their research often remains superficial in content and of limited use in the policy arena. In contrast, I saw that at LIDA, the collaborations between social, computer, and data scientists helped ensure that research is relevant to policy and decision makers, at the same time as it is duly subjected to the checks and balances of the peer-review process (in my experience, most research results in computer science appear in conference proceedings rather than in peer-reviewed journals; and the contents of the conferences are not consistently indexed, are difficult to search, and are hence difficult to access).
As far as my own presentations were concerned, they focused on theories and methods that guide the study of the potential role of the built environment (where people live and work) in shaping many types of behaviors, including travel, active living, and healthy eating. Transport planners have long considered the built environment (also characterized as land use or urban form) as part of their tool kit to manage travel demand. More recently, public health decision makers have joined in to consider the built environment as potentially instrumental in health behavior change. The now almost ubiquitous use of GPS devices to capture location and movement naturally invites further, more targeted explorations into the effect of the environment surrounding us on how we behave. GPS data and data from sensors monitoring activity and health conditions can now readily be integrated with built environment data using GIS, thus producing more detailed and precise information on activity in time and space. Many of the Leeds students were familiar with GIS or used GIS to probe into the geospatial dimension of their research. However, time was lacking in our encounters to dig deeply into approaches and methods of spatial analysis. It seemed that my lab’s work in exploring time-based measures of exposure to the built environment was novel to most students. In the seminars, we addressed the issue that many researchers lacked training in spatial thinking and analysis and therefore tended to consider the environment as something that individuals control, but less as something that can also control them (e.g., the availability of a bike lane will likely influence the choice of safely cycling to work or recreation). 
These considerations gave us the opportunity to discuss the issue of direction of causation (e.g., does the presence of a subway explain an individual’s use of said subway, or does an individual’s decision explain his/her choice of using the subway for travel?), which complicates predictive modeling given the fact that individuals cannot be randomly assigned to different environments. Nonetheless, we reckoned that with the growing awareness that most people now live in cities, coupled with the new generation’s understanding of the need for “sustainable” behaviors, comes the realization that environment does matter; that it does influence at least in part an individual’s choice of behavior. It is in this context that I felt that the work I shared with students resonated with them.
It was a pleasure to interact with the students, especially as they were all actively and creatively engaged both with what I presented and with their own work. I sincerely hope to be able to continue the dialogue and look forward to hearing from everyone.
In more general terms, my stay at LIDA was an opportunity to think about “big data.” No doubt the term is both imageable and simplistic. And for someone like me who, at a young age, was impressed by Schumacher’s book Small Is Beautiful, it’s also suspicious! I first got a glimpse of a big data future with the advent of cadastral data (tax-lot, parcels). We were already using the 1990 digitized census, but the parcel data were 1500 times more granular than the census at the tract level. Built environment data that had been painfully collected by hand for a few city blocks (on less than a square mile) could then be had for 1.2 million parcels (on 1000 square miles) at the touch of a keyboard (so to speak since, at the time, researchers needed to be familiar with ARCINFO software, and computing power was such that we used the motto “raster is faster but vector is correcter”). My second encounter with a big data future came in the mid-2000s with the advent of GPS and accelerometers, which my colleagues and I first used on some 700 people to capture their level of physical activity in space and time. Since then, combining data from individual-level sensors with environmental data in GIS has been my main exposure to big data. This type of data is what we might term “primary data,” collected under controlled conditions for the purpose of answering specific research questions.
In the collective mind, however, “big data” applies to the reels of data that, today, come associated with just about any human activity involving a computer, a cell phone, a credit or loyalty card, etc. These are the truly imageable large (and very large) data sets resulting from the omnipresence of miniaturized sensors, cameras, magnetic tapes, bar codes, contactless chips, etc., embedded in our environment and our everyday “machines.” These data accumulate continuously as we go through daily life (I try to imagine how many GPS points Google or Apple are collecting from people worldwide as I am typing these words). The popular press and, let’s admit, academia, are mesmerized by these data: it’s deus ex machina all over again. For academic research, big data represent a wealth of secondary data the likes of which have not been seen before. However, access to these data has proved difficult because of proprietary and privacy issues. Controversies on these issues are increasingly brought into the limelight because of how the data are used to manipulate human behavior. And the use of these data in academia can be constrained by research protocols that demand a transparent code of ethics. That’s where LIDA comes in, offering an impressive setting where issues of data ownership and privacy can be successfully addressed, and big data can be used to positive and productive ends! Creating LIDA clearly required hard work, but the Institute now provides an institutional framework where big data can become secondary data for social scientists to ask questions of, and to get results that can be fed back into the public policy arena. This is a huge step that takes the application of advances in data science beyond market research to social problems.
Of course, I am particularly interested in applications to health and transport, and look forward to finding out how your various projects will address such issues as sampling for generalization purposes, and testing data validity/reliability.
I am grateful that time was available to explore your region’s places and cities. As expected, the City of York was wonderful. I especially liked walking the city walls and discovering its history during the Viking era. Yet Leeds was more of a special discovery for me. First, I was taken by the spectacular architecture of some of the buildings (old and new, ordinary apartments or offices with shops at the street level), as well as the numerous arcades, some of which were amazingly elegant, and the many 19th and early 20th century public buildings. Perhaps more importantly, I was happy to find the center city alive at all times of day and in the evenings, with a mix of people shopping, working, eating, drinking, or just strolling. I joined the crowds and ate Yorkshire pudding at a Sunday night dinner with a choice of roasts at the Reliance restaurant. At Browns, only the company was better than the food! I also delighted in finding the Moore Institute because Henry Moore has long been one of my favorite artists. And I was lucky to catch the small but enchanting Takamatsu exhibit there. I also found out about the history of the Quarry Hill Flats, a large housing project modeled after Vienna’s Karl-Marx-Hof, built in 1938 and demolished in 1978 (a nice video is on Vimeo at https://vimeo.com/45835112). Had this project, on whose site now lies a yet to be integrated collection of institutional buildings (Quarry House, Leeds School of Music, etc., with the newer Victoria Gate mall across the street), been saved, it would be considered historic, a testament to the innovative mix of social, architectural, and engineering thinking of the 1930s.
Visits to Halifax and Saltaire were memorable as well. The Piece Hall was in full use on the Saturday of my visit. There were lots of local families and some tourists enjoying the mild weather and opportunities for eating and shopping. I also saw Halifax market place, a structure hardly changed from the 19th century, with stalls now filled with foods and goods of all kinds. The town of Saltaire is impressive in the way it now commemorates the best of our industrial past, where habitat and industry merged. It was a treat to visit the exhibits on the life and work of David Hockney, another one of my favorite artists!
I took many photos of the different places. But for some reason, I have only a handful of photos catching interactions with people. This is a good excuse for coming back to visit you!
Again, MY WARM THANKS to everyone for the opportunity to work with you. I look forward to continuing our collaboration.