CW12 Five important things

Each of the break-out sessions will report back on the five important things they discovered during their discussion. We've listed these things below.

Session 1

How to operate an open source software project effectively

  1. Know what you are doing; don't just put it out on SF.net
  2. Involving users in open source: how to get more users is a problem
  3. Lower the learning curve for first-time users: easy installation, an early success experience, demonstration videos
  4. Opt-in feedback system for usage, crash logs etc., to show how software is being used and what for
  5. Secure long-term stream of funding to sustain development

Software management plans and open research

  1. Funders should begin to mandate software management plans for important artefacts.
  2. Where are the repositories for software/data? How much do we trust them to last?
  3. How much should we be preserving? What is enough? Is the key to have enough information to reproduce or is it better to publish the source code?
  4. Publication of software papers. A DOI for software - following the paper publication mechanism.
  5. Reward mechanism for published code is needed - Social or Academic.

Workflow development

  1. Need people to describe their services and workflows - need incentives to encourage this.
  2. Problems surrounding use and re-use of services and workflows. No equivalent of paper citation
  3. Do workflow patterns need to be identified / specified?
  4. What can be done to ensure that theoretically pluggable services actually are?
  5. The concept of control and dataflow in workflows should be extended to other areas e.g. modelling and simulation.

Writing code for everybody

  1. Researchers aren't currently motivated to write code for everybody
  2. The need for change is recognised
  3. If change is going to come about, there needs to be a way of rewarding good software development
  4. Funding to develop software, career recognition for wanting to develop good code
  5. Researchers need to be taught best practice.

Digital humanities

  1. Need for text mining tools which are agile for teaching
  2. Need to understand the cultural aspects of software localisation
  3. Need for help disambiguating geo-tagged images and video
  4. Need for collaborations on new digital humanities projects, e.g. Open Domesday Book

Building research and communications networks across disciplines

  1. How to find out about opportunities for collaborations - meet others? Need mechanisms to tell researchers about each other to enable collaborations
  2. How do researchers find out about funding for collaborations?
  3. Find good science communicators to spread the word be that internally or to schools etc
  4. There is funding to bring individuals together - how do people find out about it?
  5. Getting people in general and researchers in particular interested in computing at an early age - aim when they're young

Grand challenges

  1. Use of existing technical solutions - workflows, general purpose software toolkits, controlled vocabularies to act as middle-tier for collaboration
  2. Challenge segregation by organising institutions around projects rather than disciplines, re-invention of institutional IT accordingly
  3. Training in collaboration - digital science for non-technical disciplines, and encouraging computer scientists to learn the languages of other communities
  4. Development of a general-purpose, usable, digital infrastructure for communication, data sharing
  5. Research councils to co-ordinate funding calls and proposal reviews to encourage long-term successful cross-discipline projects

Session 2

The role of virtual laboratories in research

  1. There is no consensus on what a VL is (and how does it relate to a VO?)
  2. There are many different scopes & scales of potential VL (people, challenge)
  3. It's not just a technical solution: social interactions too
  4. In silico processing supported by automated instruments and fieldwork
  5. Openness, sharing (and scope of sharing) and trust

How to blog and run a blog

  1. Can be deleterious to you if you say anything contentious
  2. Non-interactive blogs are internet clutter
  3. Technical posts in institution blogs are more popular
  4. Remember that a blog stays up and can be seen (publicly?) by others, including future employers - think of your digital footprint
  5. A way of getting information out, quickly, informally.

Big data

  1. Big data is difficult to define and depends on the context; examples range from big ontologies to the 10000 genome sequence project
  2. Algorithms need to change to deal with big data challenges and use existing hardware more efficiently (compute and storage)
  3. Need to educate people about big data problems, even down to which particular hardware to buy
  4. Research councils need to be aware of the avalanche of new data; funding science but not resources might not be optimal
  5. Data management policies need to be in place

Teaching programming to scientists

  1. Teaching some level of programming early (school, undergrad) is vital.
  2. Teaching the concepts behind programming: loops, branches, conditionals etc. is perhaps more important: learning different syntaxes should be easier on top of this.
  3. Need an approach which balances the two: concepts + examples in a particular language, but in a relevant context for researchers. Throwing in a CS101 module is probably not going to work.
  4. Teaching broader aspects - design, engineering approaches, version control, test etc - is also important.
  5. Teaching "effective use of the command line" - chaining programs together for e.g. data analysis - is important too.
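
As an illustration of points 2, 3 and 5 above, here is a minimal sketch of what a "concepts first, in a research context" example might look like. It is not taken from the session: the readings, the function names parse and mean, and the use of Python are all assumptions made for illustration.

```python
# Hypothetical instrument readings; "n/a" marks a missing measurement.
readings = ["12.0", "n/a", "14.0", "13.0"]


def parse(lines):
    """Loop and branch: keep only the entries that can be read as numbers."""
    values = []
    for line in lines:              # loop over every reading
        try:
            values.append(float(line))
        except ValueError:          # branch: skip anything that is not a number
            continue
    return values


def mean(values):
    """Return the average of the values, or None if there are none."""
    if not values:
        return None
    return sum(values) / len(values)


# Chaining: the output of one small step feeds the next,
# much like piping programs together on the command line.
print(mean(parse(readings)))
```

The same two ideas (a loop with a conditional, and small steps chained together) carry over to other languages; only the syntax changes.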

Software attribution, citation and credit mechanisms

  1. Should we use academic methods like paper citations? Or other forms of social credit like open source kudos mechanisms, e.g. Stack Overflow?
  2. Spreading the reward between reuse and writing reusable code.
  3. Is software and data credit exchangeable with research credit?
  4. Role of funders. Try a Stack Overflow research council experiment?
  5. Lack of incentivisation produces problems for research.

Pro-environmental behavior change

  1. There isn't an established method to measure the impact of the embodied carbon of products and activities. For example, what is the embodied carbon of oranges from Spain?
  2. Unintended consequences of people reducing their carbon impact, e.g. the Jevons paradox
  3. A living lab or pilot with a sample of a community could be a good place to start and try to identify patterns of change in behavior and the impact of such changes
  4. This app would mine data from other apps (boarding pass apps, an Oyster app, your PayPal account), as apps develop and your phone becomes a much more integral part of your life, and from smart meters and smart appliances in the home.
  5. We should be getting people to think about their total CO2 emissions over their lifetime rather than at a point in time

Mechanisms for funding software

  1. No coherent policy for funding SW in the UK
  2. How to fund blue sky software development
  3. The CCPs are useful but software development is not their primary goal

Session 3

Bringing together representatives of the research community

  1. Put together an email list for all community people - and only Gillian and Simon can post to it - event updates etc.
  2. Community and Campus champions provide access to national and international infrastructure
  3. There's a good opportunity for community reps to feed back to their university through the advisory/collaboration board meetings
  4. Digital Research 2012 (or whatever it's called) might be a useful forum for everyone to meet up
  5. Community reps might not know what the different communities

Successful collaboration with computer scientists

  1. Don't talk to computer scientists.
  2. Researcher-developer or computational scientist is what you need.
  3. Get them involved early.
  4. Include costings for these people in a grant.
  5. Different degrees of hardenedness exist. Think about whether you want something usable for your group in future, or a fully hardened project?

Software testing and an introduction to continuous integration

  1. Automated testing environments are not a panacea.
  2. There are many silver bullets, but there are even more werewolves.
  3. Different technologies provide different interfaces for testing environments.
  4. Research objectives are often not concerned with testing for legacy cases.
  5. The choice of test environment depends on the application
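
For readers new to the topic, the following is a minimal sketch of what an automated test can look like. It is not from the session: the function normalise is a hypothetical piece of research code, and Python's standard unittest module is just one of the many possible test environments referred to above.

```python
import unittest


def normalise(values):
    """Hypothetical research function: scale values so that they sum to 1."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalise values that sum to zero")
    return [v / total for v in values]


class NormaliseTest(unittest.TestCase):
    def test_result_sums_to_one(self):
        # Property check: whatever the input, the output should sum to 1.
        self.assertAlmostEqual(sum(normalise([1, 2, 3, 4])), 1.0)

    def test_zero_total_is_rejected(self):
        # Edge case: the function should fail loudly rather than divide by zero.
        with self.assertRaises(ValueError):
            normalise([0, 0])
```

Saved in a file such as test_normalise.py (the name is an assumption), the tests can be run with "python -m unittest"; a continuous integration service would run the same command automatically on every change.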

Software Catalogues

  1. There's more than one audience for such catalogues: end-users, developers, funders, publishers
  2. Previous efforts have been unsuccessful: we need to learn from their mistakes
  3. Source code repositories are becoming registries as they add collaborative features
  4. Features of high-quality registries (good coverage, up-to-dateness) are orthogonal to features of high-quality software (suitability, reliability)
  5. We need to reduce the cost of maintenance - this is possible

Session 4

Using the internet and social media to increase your impact and publicise research to the public and research community

  1. Think outside the box - your research could be viewed by 8000 members of the public each day in a science museum
  2. Engage with young people - can your research link into school curricula?
  3. Use Klout or similar to validate your tweet/blog impact
  4. Use novel methods to display results, e.g. YouTube videos, simulations
  5. When writing, know the ways to improve your display ranking in search engines

Managing copyright and licence issues

  1. Universities and the individuals in their IP / tech transfer groups tend to be very unclear about what sort of licensing and use they might permit, and sometimes about who owns the copyright in what
  2. There's no clear set of answers for researchers: what licence to use and why, whose copyright should I write at the top, am I even allowed to do what I want to do etc
  3. We don't know how much different institutions have in common
  4. Different disciplinary communities might find different licences more comprehensible or suitable -- e.g. some disciplines might be more familiar with CC (is CC0, i.e. public domain, a suitable licence ever?)
  5. Licensing is both a quagmire and a bit of a pain -- unlike time spent improving your work in other ways, there's no obvious upside to time spent thinking about licences -- it's more about avoiding risks and dealing with the lingering uncertainty

Developing the profession of a scientific software engineer and the career track of software developers in academia

  1. We need a label/name for a new profession: “Research Software Engineer”
  2. Creating an institution or professional body – SSI could help, BCS “Chartered eng.”
  3. Certification needed: BCS “Chartered eng.”
  4. Education: each other and UGs
  5. Recognition and progression: where are you based after being project based? Industry – academic movement is helpful. Industrial experience is valuable.

Best practices for documenting scientific software development

  1. There is not just one reader of documentation; each has their own requirements, e.g. users versus developers. Users may start as black box users but soon want to understand the internals and tweak.
  2. Papers describe how it works, software developer manuals describe what it does; neither is the same as describing how to use it.
  3. A 2-page, 10-minute quick start guide is essential, as it provides the initial uptake - without even that, there is no uptake.
  4. Dynamic approach to documentation e.g. archived mailing lists and issue trackers, generate documentation on demand.
  5. Bad documentation can deter users. Good documentation can persuade users to make the effort because you have.

Cloud Computing

  1. We have fifteen years of experience with distributed computing in a Grid context and this should not be thrown away. Lessons of authorisation and federation are particularly relevant.
  2. Financial issues, e.g. credit card payment. Academic institutions don't like to see money go out of the institution to external agencies. Issues: how credit card payments are accounted against projects.
  3. Why are people using Cloud? Convenience, immediate access for cash, attractive for people who need computing in bursts. May not be so good for heavy users. Balance of private/public provision of computing resource.
  4. IPR issues are very important: how much control do you have over your data in the Cloud?
  5. Software as a service for academically produced software is an attractive idea. Don't have to have multiple versions of software for different OSs, lower entry point for users. E.g. NeuroDebian, BioLinux.

Software Localisation

  1. Immense number of examples which potentially require adaptation to different target markets.
  2. Impossibility to create one exhaustive list of culturally relevant aspects of software.
  3. It is not clear what aspects of software are affected by culture.
  4. Is "localisation" as a term applicable to "domain culture", compared with "national culture"?
  5. Assumptions about cultures are difficult to avoid as making assumptions is an integral part of developing software.