UWA Living Lab Kicks Off

‘Innovators find it difficult to access sites to prove up results and miners are averse to trialling/introducing innovation without proven results’ -METS Ignited 2017

The UWA Living Lab project aims to help bridge this gap.

The Living Lab project funding was officially announced recently at the METS Ignited event at the CORE Innovation Hub. The launch funding is being supplied through the METS Ignited Collaboration Project Fund, from the BHP Fellowship for Engineering for Remote Operations. This is a partnership with CORE Innovation Hub and the UWA Facilities Management group.

More details are available on the UWA website.

Meta Project Thoughts

As engineers and scientists, whenever we embark on a new project we are generally full of enthusiasm and excitement, and are raring to get going. This is fantastic, and this enthusiasm should not be dampened, but it is worthwhile to take a few moments to think about how you should record and back up the work that you are doing.

Project Name

Give a few minutes' thought to the name of your project. The project name will hopefully be something catchy and easy to remember, and will bring out the key features you are investigating. You should avoid any brand names or company names.

References

One of the first tasks when undertaking a new project is doing a literature review. This will generally generate a lot of references (these may be published articles, websites, books, etc.). To record all of these quickly and easily, it is recommended to use a reference cataloguing system such as Zotero.

Versioning

If a lot of your work is computer based (whose isn’t these days?), then it is highly recommended to use some sort of version control system. The one with the most traction at the moment is called git. Git is certainly not easy or intuitive to use, but it is rather powerful. To make life a little easier, there is a website called GitHub which removes some of the pain associated with using git.

Backing Up

It is all very well to have a version control system to be able to review or recreate your work from any time in the past, but it is very important to realise that this is an archive and NOT a backup. At first glance, the version control system might seem like a backup, but it is simple to see the difference by thinking about what would happen if you wanted to look at an old version of a file, but discovered that the hard disk containing your repository had been corrupted. This old version would be gone forever, because it was only contained in the primary, active repository. This repository needs to be backed up. Another analogy is that of a state library. It contains many archives, but if the library burns down these are gone forever, UNLESS there is a copy (or backup) at some other location.

One of the very important criteria that I have discovered for a backup system is automation. If the backup system is not completely automatic then, in general, it simply will not be maintained.

If you are using GitHub rather than just git, you might decide that maintaining a cloud copy of your repository is sufficient backup. Otherwise, you could choose to back up to a local external hard drive or use another cloud service.
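As a minimal sketch of what an automated backup step might look like, the Python function below writes a timestamped compressed copy of a project folder (including its hidden .git directory) to a second location. The function name and the folder layout are hypothetical, and the sketch assumes the backup location is on a different disk from the working copy.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_directory(source, backup_root):
    """Write a timestamped .tar.gz copy of `source` into `backup_root`.

    Hypothetical helper for illustration only. It archives the whole
    folder, so a git repository's .git directory is included.
    """
    source = Path(source).resolve()
    if not source.is_dir():
        raise NotADirectoryError(f"{source} is not a directory")

    # Assumption: backup_root lives on a different disk or service.
    backup_root = Path(backup_root).resolve()
    backup_root.mkdir(parents=True, exist_ok=True)

    # A timestamp in the file name keeps earlier backups from being overwritten.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    base_name = backup_root / f"{source.name}-{stamp}"

    archive = shutil.make_archive(
        str(base_name),
        "gztar",
        root_dir=source.parent,
        base_dir=source.name,
    )
    return Path(archive)
```

To make this automatic, a script calling this function could be run by the operating system's scheduler (for example cron or Windows Task Scheduler) so that nobody has to remember to do it.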

Collaborating

A Google Docs folder can be useful for rapidly developing a list of resources and ideas for a project in collaboration with colleagues because it allows for real-time co-authoring of a document. However, there are a couple of issues to be wary of. Firstly, the contents of a Google Document do not exist ANYWHERE except on GOOGLE SERVERS or in closed-source GOOGLE APPS. I think that this is a major weakness in their system. There is no guarantee that you will be able to access this document in the future. You could forget or lose your account, Google could cancel your account, or somebody with whom you share the file could delete it. A possible solution to this is to continue to use Google, but to periodically export the document to a docx/xlsx/pptx file, which can then be version controlled. Secondly, Google Drive does not work very well with git, and it is possible to break your repository if the git repository is housed within a Google Drive directory.

There are other real-time collaboration tools available. If writing in LaTeX, these include ShareLaTeX and Overleaf. Office 365 also has the ability to do real-time co-authoring; however, I have not used this.

Look at your project from a variety of perspectives

Generally there is more than one way to think about any project. It can often be helpful to think about the project from a variety of different angles. For example, your project might use a variety of components so you might like to break your project down into the different components used. Your project will also generally require a variety of different skill-sets. You can divide your project up into the various skill-sets that you will need (e.g. coding, electronics, analysis).

FLOC – Machine Learning meets Formal Methods workshop, Oxford

This one-day workshop on 13 July 2018 brought together the Machine Learning and Formal Methods communities. Here is a summary of some take-aways. Highlights were the talks by Pushmeet Kohli from Google DeepMind UK, Alison Lowndes from NVIDIA, and Adnan Darwiche from UCLA. Melinda Hodkiewicz (SHL) and Ashwin D’Cruz (ex-SHL, now working for Calipsa in London) attended. The workshop page is https://www.floc2018.org/summit-on-machine-learning/

Pushmeet Kohli (Google DeepMind): Challenges for AI are to ensure that it is a) robust to adversaries, b) able to generalise well to variations in the real world, c) fair, and d) compliant with regulations. When talking about ‘fairness’, he split the discussion into “What do we mean by fair?” and “How to make AI fair”. He did not answer the “what” question, instead saying that “this needed to be set in regulations”. As for the “how”, Kohli suggested three steps: 1) rigorous testing, 2) developing robust AI, and 3) verifying AI systems. A significant challenge for AI is that the test set evaluation approaches commonly used in ML are inappropriate for 1) adversarial environments and 2) safety critical domains. In safety critical domains loss functions are unbounded. Also, you would need a lot of samples of bad events for test set evaluation, and we can’t afford to do this. He then gave some examples of work his team at Google is doing (see his recent ICML, ICLR and NIPS papers) and left us with the idea that we might need a new language for AI which has a suitable inductive bias (the set of assumptions that the learning algorithm uses to predict outputs given inputs it has not encountered) and the right expressiveness to describe what’s going on.
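To get a feel for why test set evaluation is so expensive when failures must be rare, here is a rough sketch of a standard statistical argument (not something presented in the talk). It assumes independent test cases with a constant failure probability, and asks how many failure-free tests are needed before we can claim, with a given confidence, that the failure rate is below some target. The function name is hypothetical.

```python
import math


def failure_free_trials_needed(max_failure_rate, confidence=0.95):
    """Smallest n such that n failure-free independent trials support the
    claim 'failure rate < max_failure_rate' at the given confidence.

    Assumption: trials are independent with a constant failure probability,
    so the chance of n clean trials at rate p is (1 - p) ** n.
    """
    if not 0.0 < max_failure_rate < 1.0:
        raise ValueError("max_failure_rate must be between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")

    # Solve (1 - p) ** n <= 1 - confidence for n.
    n = math.log(1.0 - confidence) / math.log1p(-max_failure_rate)
    return math.ceil(n)


if __name__ == "__main__":
    for rate in (1e-2, 1e-4, 1e-6):
        print(rate, failure_free_trials_needed(rate))
```

The required number of tests grows roughly as 3 divided by the target failure rate at 95% confidence, which is why a failure rate of one in a million cannot realistically be demonstrated by test set evaluation alone.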

Comment: Melinda asked the room if there was anyone at the workshop working for legislators or regulators; there was not. It is not clear how legislators are going to develop workable regulations, or how regulators will have the capacity to assess practice against these regulations, without a good understanding of the issues being discussed at these types of events.

Alison Lowndes (NVIDIA): NVIDIA has developed massive simulation platforms, and Alison talked about their work on Jetson Xavier, an AI computer for autonomous machines delivering GPU workstation performance in a single embedded module (https://developer.nvidia.com/jetson-xavier-devkit). She observed that while Reinforcement Learning was highly fashionable (80 papers/day published on arXiv), it is not commercial yet. Classical ML methods (SVM, MLP, GBDT) are still very relevant and widely used. Convolutional neural networks are widely used. She expressed concern about the “common person’s voice in the room” and suggested that philosophy and psychology will become more important. Finally, and relevant for the SHL and Makers, she said that educational institutions can get a free DevKit from NVIDIA: https://developer.nvidia.com/teaching-kits

Andre Platzer (Carnegie Mellon) talked on ‘safe’ reinforcement learning via formal methods with a focus on safety critical systems. How do you demonstrate that the algorithm is “provably safe”? He talked about the need to 1) learn safety, 2) learn a safety policy, and 3) verify, and about the issue of “what if the model is incorrect?”. The safety policy appears to be based on the idea that if we have seen this output before and it was OK, then it should be safe given the same context. This raises the questions of how we know to trust this and how we know if the context has changed. Andre runs the Logical Systems Lab http://www.ls.cs.cmu.edu/ and has a textbook on Logical Foundations of Cyber-Physical Systems.

Sumit Gulwani from Microsoft talked about their PROSE kit https://microsoft.github.io/prose/. This is about programming by examples: the automatic generation of programs from input-output examples. It can build programs in various languages such as Python/R/C#, and some of its functionality is baked into Excel. Some scripts that are now tedious to write can be automated.
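To illustrate the general idea of programming by examples, here is a toy sketch in Python. It is not the PROSE API and makes no claim about how PROSE works internally. It simply searches over short chains of a handful of string operations (a made-up mini-language chosen for this illustration) until it finds a chain that reproduces every input-output example.

```python
from itertools import product

# Hypothetical mini-language: each primitive maps a string to a string.
PRIMITIVES = {
    "strip": str.strip,
    "lower": str.lower,
    "upper": str.upper,
    "title": str.title,
    "first_word": lambda s: s.split()[0] if s.split() else "",
    "last_word": lambda s: s.split()[-1] if s.split() else "",
    "initial": lambda s: s[:1],
}


def run(program, text):
    """Apply a program (a sequence of primitive names) to text, left to right."""
    for name in program:
        text = PRIMITIVES[name](text)
    return text


def synthesise(examples, max_depth=3):
    """Return the shortest chain of primitives consistent with all examples,
    or None if no chain of length <= max_depth fits.

    This is brute-force enumeration, used here only to show the concept.
    """
    for depth in range(1, max_depth + 1):
        for program in product(PRIMITIVES, repeat=depth):
            if all(run(program, inp) == out for inp, out in examples):
                return program
    return None


if __name__ == "__main__":
    examples = [("ada lovelace", "LOVELACE"), ("alan turing", "TURING")]
    program = synthesise(examples)
    print(program)
    print(run(program, "grace hopper"))
```

The user supplies only the examples; the search finds a program that can then be applied to new inputs. Real systems need far richer languages and much smarter search than this exhaustive loop.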

Marta Kwiatkowska (Oxford) talked about her team’s work on ‘safety verification for deep learning networks with proven guarantees’. She made the point that while there is an infinite set of possible outcomes, we only measure ‘accuracy’ on a finite data set. She demonstrated how deep learning networks are unstable to adversarial perturbations using image processing of a car sign (there are plenty of papers on this on arXiv) and asked ‘how can we verify that such behaviour cannot occur’. Marta’s group at Oxford is involved in modelling and automated verification techniques for software systems. One of the current projects in her group is safety and trust for mobile autonomous robots. Her research page is http://www.cs.ox.ac.uk/people/marta.kwiatkowska/research.html
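The instability to small perturbations can be seen even in a very simple model. The sketch below is a toy illustration only, not the verification method from the talk: it uses a hand-picked linear classifier (all weights and inputs are made-up numbers) and nudges every input by a small amount in the direction that most reduces the score, which is enough to flip the predicted class.

```python
def score(weights, bias, x):
    """Linear score w.x + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def classify(weights, bias, x):
    """Return class 1 if the score is positive, else class 0."""
    return 1 if score(weights, bias, x) > 0 else 0


def sign(value):
    return (value > 0) - (value < 0)


def adversarial_example(weights, bias, x, epsilon):
    """Shift each input by at most epsilon so as to push the score
    towards the opposite class (the sign-of-gradient idea for a linear model).
    """
    direction = -1 if classify(weights, bias, x) == 1 else 1
    return [xi + direction * epsilon * sign(w) for w, xi in zip(weights, x)]


if __name__ == "__main__":
    # Made-up numbers chosen purely for illustration.
    weights = [0.5, -0.3, 0.8, -0.6]
    bias = 0.0
    x = [0.2, 0.1, 0.1, 0.1]

    x_adv = adversarial_example(weights, bias, x, epsilon=0.05)
    print(classify(weights, bias, x), classify(weights, bias, x_adv))
```

Each input changes by no more than 0.05, yet the small changes add up across all the inputs and the predicted class changes. With the thousands of inputs in an image, each individual change can be far smaller and still have the same effect.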

Adnan Darwiche from UCLA presented on “what just happened in AI”. The talk drew on his recent paper “Human-Level Intelligence or Animal-Like Abilities” (https://arxiv.org/abs/1707.04327). He made a number of interesting observations: 1) there are lots of new AI applications, 2) AI has been around for more than 50 years, and 3) the AI curriculum is almost unchanged. Essentially every behaviour can be captured to some extent by a function. We are now building bigger functions and we have more data. A deep learning NN is a function, and architecting the structure of a NN is function engineering. Next he moved on to how our perception of value has changed. Model-based approaches try to understand a system, whereas ML models translate without insight. We have realised that in many cases (e.g. social media) you don’t need the understanding to be useful. The ease with which we can get results that may be the same as, or only slightly better than, what we can do with model-based methods is very attractive. However, he warned about the growing gap between hype and reality and reminded the audience of a period he described as the “AI winter”. He also warned about a lost generation of AI researchers who are well versed in NN models but not in logic, and stressed the need to understand the limitations of function-based approaches and to characterise deep learning functions in a scientifically precise manner. It’s worth reading his paper from the link above for a full discussion of his concerns.

Learning from digital humanities for the Siri for Maintenance project

This week Melinda talked with Dr. Beatrice Alex, who works in computational linguistics at the University of Edinburgh. I learned a lot about when and how to include subject matter experts in the NLP and semantic object identification pipeline. We can take some of the lessons she learned in their digital humanities project into our Siri for Maintenance project. Beatrice’s homepage is http://homepages.inf.ed.ac.uk/balex/

Attaching the BlueBox to rotating machinery

Today the SHL managed to get a hold of an old and broken record player.

After disassembling the record player, we discovered that the fault was a deteriorated elastic belt. We made do with what was available and replaced the belt with a piece of string. Whilst the record player was open, we also replaced the wires running to the 12V DC motor with wires leading to a DC power supply unit. This allowed us to control the spin of the record player by varying the supplied voltage, so the record player no longer just runs at either 33 or 45 revolutions per minute. We were then able to place a couple of BlueBoxes on the record player and get live data measuring the acceleration caused by the rotation of the disc bed.
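As a rough check on what such a sensor might be expected to read, the sketch below computes the steady-state centripetal acceleration a = ω²r for a point on the disc bed. The 0.10 m distance from the spindle is an assumed value for illustration, not a measurement from our setup, and the function name is hypothetical.

```python
import math


def centripetal_acceleration(rpm, radius_m):
    """Centripetal acceleration (m/s^2) of a point at radius_m metres
    from the axis of a disc spinning steadily at rpm revolutions per minute.
    """
    if radius_m < 0:
        raise ValueError("radius_m must be non-negative")
    omega = 2.0 * math.pi * rpm / 60.0  # angular velocity in rad/s
    return omega ** 2 * radius_m


if __name__ == "__main__":
    radius = 0.10  # assumed sensor distance from the spindle, in metres
    for rpm in (33, 45):
        a = centripetal_acceleration(rpm, radius)
        print(f"{rpm} rpm: {a:.2f} m/s^2")
```

Because the acceleration scales with the square of the rotation speed, varying the supply voltage gives a convenient way to sweep the sensor through a range of accelerations, and a sensor placed further from the spindle will read proportionally more.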