原文链接:[BUILDING A NEURAL NET FROM SCRATCH IN GO](https://www.datadan.io/building-a-neural-net-from-scratch-in-go/?from=hackcv&hmsr=hackcv.com&utm_medium=hackcv.com&utm_source=hackcv.com)
I'm super pumped that my new book [Machine Learning with Go](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-go) is now available! Writing the book allowed me to get a complete view of the current state of machine learning in Go, and let's just say that I'm pretty excited to see how the community growing!
我很高兴我的新书[Machine Learning with Go](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-go)现已推出!写这本书让我可以全面了解Go中机器学习的现状,我很高兴看到社区如何成长!
In the book (and for my own edification), I decided that I would build a neural network from scratch in Go. Turns out, this is fairly easy, and I thought it would be great to share my little neural net here.
(If you are interested in leveraging pre-existing Go packaging for machine learning, check out [all the great existing packages](https://github.com/gopherdata/resources/tree/master/tooling), and be sure to watch [Chris Benson's recent talk](https://youtu.be/CHzMEamGZDA) at GolangUK about Deep Learning in Go)
There are a whole variety of ways to accomplish this task of building a neural net in Go, but I wanted to adhere to the following guidelines:
在Go中完成构建神经网络的任务有多种方法,但我想遵循以下准则:
-**No cgo** - I want my little neural net to compile nicely to a statically linked binary, and I also want to highlight the numerical functionality that is available natively in Go.
-**gonum matrix input** - I want supply matrices to my neural network for training, similar to how you would supply `numpy` arrays to most Python machine learning functions.
-**Variable numbers of nodes** - Although I will only illustrate one architecture here, I wanted my code to be flexible, such that I could tweak the numbers of nodes in each layer for other scenarios.
This type of single layer neural net might not be very "deep," but it has proven to be very useful for a huge majority of simple classification tasks. In our case, we will be training our model to classify iris flowers based on the [famous iris flower dataset](https://en.wikipedia.org/wiki/Iris_flower_data_set). This should be more than enough to solve that problem with a high degree of accuracy.
Each of the **nodes** in the network will take in one or more inputs, combine those together linearly (using **weights** and a **bias**), and then apply a non-linear **activation function**. By optimizing the weights and the biases, with a process called [**backpropagation**](https://en.wikipedia.org/wiki/Backpropagation), we will be able to mimic the relationships between our inputs (measurements of flowers) and what we are trying to predict (species of flowers). We will then be able to feed new inputs through the optimized network (i.e., we will **feed** them **forward**) to predict the corresponding output.
(If you are new to neural nets, you might also check out [this great intro](https://ujjwalkarn.me/2016/08/09/quick-intro-neural-networks/), or, of course, you can read the relevant section in [Machine Learning with Go](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-go).)
We also need to define our activation function and it's derivative, which we will utilize during backpropagation. There are many choices for activation functions, but here we are going to use the [sigmoid function](http://mathworld.wolfram.com/SigmoidFunction.html). This function has various advantages, including probabilistic interpretations and a convenient expression for it's derivative.
With the definitions above taken care of, we can write an implementation of the [backpropagation method](https://en.wikipedia.org/wiki/Backpropagation) for training, or optimizing, the weights and biases of our network. The backpropagation method involves:
1.Initializing our weights and biases (e.g., randomly).
2.Feeding training data through the neural net forward to produce output.
3.Comparing the output to the correct output to get errors.
4.Calculating changes to our weights and biases based on the errors.
5.Propagating the changes back through the network.
6.Repeating steps 2-5 for a given number of **epochs** or until a stopping criteria is satisfied.
1.初始化我们的权重和偏差(例如,随机)。
2.通过神经网络向前馈送训练数据以产生输出。
3.将输出与正确的输出进行比较以获得错误。
4.根据错误计算我们的权重和偏差的变化。
5.通过网络传播更改。
6.对于给定数量的**批次**重复步骤2-5或直到满足停止标准。
In steps 3-5, we will utilize [**stochastic gradient descent**](https://en.wikipedia.org/wiki/Stochastic_gradient_descent)(SGD) to determine the updates for our weights and biases.
To implement this network training, I created a method on `neuralNet` that would take pointers to two matrices as input, `x` and `y`. `x` will be the features of our data set (i.e., the independent variables) and `y` will represent what we are trying to predict (i.e., the dependent variable). I will show an example of these later in the article, but for now, let's assume that they take this form.
The actual implementation of backpropagation is shown below. **Note/Warning**, for clarity and simplicity, I'm going to create a handful of matrices as I carry out the backpropagation. For large data sets, you would likely want to optimize this to reduce the number of matrices in memory.
After training our neural net, we are going to want to use it to make predictions. To do this, we just need to feed some given `x` values forward through the network to produce an output. This looks similar to the first part of backpropagation. Except, here we are going to return the generated output.
Ok, we now have the building blocks that we will need to train and test our neural network. However, before we go off and try to run this, let's take a brief look at the data that I will be using to experiment with this neural net.
The data I'm going to use is a slightly transformed version of the popular [iris data set](https://archive.ics.uci.edu/ml/datasets/iris). This data set includes sets of four iris flower measurements (what will become our `x` values) along with a corresponding indication of iris species (what will become our `y` values). To utilize this data set with our neural net, I have slightly transformed the data set, such that the species values are represented by three binary columns (1 if the row corresponds to that species, 0 otherwise). I have also added a little bit of random noise to the measurement to try and confuse the neural net (because this problem is pretty easy to solve otherwise):
That gives us a trained neural net. We can then parse the test data into matrices `testInputs` and `testLabels` (I'll spare you these details as they are the same as above), use our `predict()`method to make predictions for the flower species, and then compare the predictions to the actual species. Calculating the predictions and accuracy looks like the following:
Woohoo! 97% accuracy isn't too shabby for our little from scratch neural net! Of course this number will vary due to the randomness in the model, but it does generally perform very nicely.
我希望这对你来说是有益的和有趣的。所有的代码和数据都是[可在这里](https://github.com/dwhitena/gophernet),所以自己试试吧! 此外,如果您对Go for ML / AI和Data Science感兴趣,我强烈建议:
I hope this was informative and interesting for you. All of the code and data is [available here](https://github.com/dwhitena/gophernet), so try it out yourself! Also, if you are interested in Go for ML/AI and Data Science in general, I highly recommend:
- Joining [Gophers Slack](https://invite.slack.golangbridge.org/), and participating in the #data-science channel (I'm @dwhitena there)
- Checking out all the great Go ML/AI/data tooling [here](https://github.com/gopherdata/resources/tree/master/tooling)
- Following the [GopherData blog/website](http://gopherdata.io/) for more interesting articles and community information
| [BUILDING A NEURAL NET FROM SCRATCH IN GO](https://www.datadan.io/building-a-neural-net-from-scratch-in-go/?from=hackcv&hmsr=hackcv.com&utm_medium=hackcv.com&utm_source=hackcv.com) | |
| [A Quick Introduction to Neural Networks](https://ujjwalkarn.me/2016/08/09/quick-intro-neural-networks/?from=hackcv&hmsr=hackcv.com&utm_medium=hackcv.com&utm_source=hackcv.com) | |
| [Practical Data Science in Python](https://radimrehurek.com/data_science_python/?from=hackcv&hmsr=hackcv.com&utm_medium=hackcv.com&utm_source=hackcv.com) | |
\ No newline at end of file
| [BUILDING A NEURAL NET FROM SCRATCH IN GO](https://www.datadan.io/building-a-neural-net-from-scratch-in-go/?from=hackcv&hmsr=hackcv.com&utm_medium=hackcv.com&utm_source=hackcv.com) | 介绍用Go构建一个简单的神经网络 |
| [A Quick Introduction to Neural Networks](https://ujjwalkarn.me/2016/08/09/quick-intro-neural-networks/?from=hackcv&hmsr=hackcv.com&utm_medium=hackcv.com&utm_source=hackcv.com) | 神经网络的快速入门 |
| [Practical Data Science in Python](https://radimrehurek.com/data_science_python/?from=hackcv&hmsr=hackcv.com&utm_medium=hackcv.com&utm_source=hackcv.com) | 用垃圾邮件分类完整的介绍了建立机器学习模型过程(Python2) |
原文链接:[Machine learning skills for software engineers](https://www.infoworld.com/article/3223688/machine-learning/machine-learning-skills-for-software-engineers.html?from=hackcv&hmsr=hackcv.com&utm_medium=hackcv.com&utm_source=hackcv.com)
### You don’t need to be a data scientist to do machine learning, but you do need data skills. Start with these
![Machine learning skills for software engineers](https://images.techhive.com/images/article/2017/04/1_proven-skills-development-100719359-large.jpg)
Thinkstock
*Ted Dunning is chief applications architect at MapR Technologies.*
A long time ago in the mid 1950’s, Robert Heinlein wrote a story called “A Door into Summer” in which a competent mechanical engineer hooked up some “Thorsen tubes” for pattern matching memory and some “side circuits to add judgment” and spawned an entire industry of intelligent robots. To make the story more plausible, it was set well into the future, in 1970. These robots could have a task like dishwashing demonstrated to them and then replicate it flawlessly.
I don’t think I have to tell you, but it didn’t turn out that way. It may have seemed plausible in 1956, but by 1969 it was clear it wouldn’t happen in 1970. And then a bit later it was clear that it wouldn’t happen in 1980, either, nor in 1990 or 2000. Every 10 years, the ability for a normal engineer to build an artificially intelligent machine seemed to retreat at least as fast as time passed. As technology improved, the enormous difficulty of the problem became clear as layer after layer of difficulties were found.
**[ Review: TensorFlow, Spark MLlib, Scikit-learn, MXNet, Microsoft Cognitive Toolkit, and Caffe machine learning and deep learning frameworks. | Roundup: 13 frameworks for mastering machine learning. | Cut to the key news and issues in cutting-edge enterprise technology with the InfoWorld Daily newsletter. ]**
It wasn’t that machine learning wasn’t solving important problems; it was. For example, by the mid-90’s essentially all credit card transactions were being scanned for fraud using neural networks. By the late 90’s Google was analyzing the web for advanced signals to aid in search. But your day to day software engineer didn’t have a chance of building such a system unless they went back to school for a Ph.D. and found a gaggle of like-minded friends who would do the same thing. Machine learning was hard, and each new domain required breaking a significant amount of new ground. Even the best researchers couldn’t crack hard problems like image recognition in the real world.
I am happy to say that this situation has changed dramatically. I don’t think that any of us is about to found a Heinlein-style, auto-magical, all-robotic engineering company in the near future, but it is now possible for a software engineer without any particularly advanced training to make systems that do really amazing stuff. The surprising part is not that computers could do these things. (It has been known since 1956 that this would be possible any day now!) What is surprising is how far we’ve come in the last decade. What would have made really good Ph.D. research 10 years ago is now a cool project for a weekend.
## Machine learning is getting easier (or at least more accessible)
In our forthcoming book “Machine Learning Logistics”(coming in late September 2017 from O’Reilly), Ellen Friedman and I describe a system known as TensorChicken that our friend and software engineer, Ian Downard, has built as a fun home project. The problem to be solved was that blue jays were getting into our friend’s chicken coop and pecking the eggs. He wanted to build a computer vision system that could recognize a blue jay so that some kind of action could be taken to stop the pecking.
After seeing a deep learning presentation by Google engineers from the TensorFlow team, Ian got cracking and built just such a system. He was able to do this by starting with a partial model known as [Inception-v3](https://www.tensorflow.org/tutorials/image_recognition) and training it to the task of blue jay spotting with a few thousand new images taken by a webcam in his chicken coop. The result could be deployed on a Raspberry Pi, but plausibly fast response time requires something a bit beefier, such as an Intel Core i7 processor.
And Ian isn’t alone. There are all sorts of people, many of them *not* trained as data scientists, building cool bots to do all kinds of things. And an increasing number of developers are beginning to work on a variety of different, serious machine learning projects as they recognize that machine learning and even deep learning have become more accessible. Developers are beginning to fill roles as data engineers in a “data ops” style of work, where data-focused skills (data engineering, architect, data scientist) are combined with a devops approach to build things such as machine learning systems.
It’s impressive that a computer can fairly easily be trained to spot a blue jay, using an image recognition model. In many cases, ordinary folks can sit down and just do this and a whole lot more besides. All you need is a few pointers to useful techniques, and a bit of a reset in your frame of mind, particularly if you’re mainly used to doing software development.
Building models is different from building ordinary software in that it is data-driven instead of design-driven. You have to look at the system from an empirical point of view and rely a bit more than you might like on experimental proofs of function rather than careful implementation of a good design accompanied with unit and integration tests. Also keep in mind that in problem domains where machine learning has become easy, it can be stupidly easy. Right next door, however, are problems that are still very hard and that do require more sophisticated data science skills, including more math. So prototype your solution. Test it. Don’t bet the farm (or the hen house) until you know your problem is in the easy category, or at least in the not-quite-bleeding-edge category. Don’t even bet the farm after it seems to work for the first time. Be suspicious of good looking results just like any good data scientist.
## Essential data skills for machine learning beginners
The rest of this article describes some of the skills and tactics that developers need in order to use machine learning effectively.
### Let the data speak
In good software engineering, you can often reason out a design, write your software, and validate the correctness of your solution directly and independently. In some cases, you can even mathematically prove that your software is correct. The real world does intrude a bit, especially when humans are involved, but if you have good specifications, you can implement a correct solution.
With machine learning, you generally don’t have a tight specification. You have data that represents the past experience with a system, and you have to build a system that will work in the future. To tell if your system is really working, you have to measure performance in realistic situations. Switching to this data-driven, specification-poor style of development can be hard, but it is a critical step if you want to build systems with machine learning inside.
### Learn to spot the better model
Comparing two numbers is easy. Assuming they are both valid values (not NaN’s), you check which is bigger, and you are done. When it comes to the accuracy of a machine learning model, however, it isn’t so simple. You have lots of outcomes for the models you are comparing, and there isn’t usually a clean-cut answer. Pretty much the most basic skill in building machine learning systems is the ability to look at the history of decisions that two models have made and determine which model is better for your situation. This judgment requires basic techniques to think about values that have an entire cloud of values rather than a single value. It also typically requires that you be able to visualize data well. Histograms and scatter plots and lots of related techniques will be required.
### Be suspicious of your conclusions
Along with the ability to determine which variant of a system is doing a better job, it is really important to be suspicious of your conclusions. Are your results a statistical fluke that will go the other way with more data? Has the world changed since your evaluation, thus changing which system is better? Building a system with machine learning inside means that you have to keep an eye on the system to make sure that it is still doing what you thought it was doing to start with. This suspicious nature is required when dealing with fuzzy comparisons in a changing world.
### Build many models to throw away
It is a well-worn maxim in software development that you will need to build one version of your system just to throw away. The idea is that until you actually build a working system, you won’t really understand the problem well enough to build that system well. So you build one version in order to learn and then use that learning to design and build the real system.
With machine learning, the situation is the same, but more so. Instead of building just one disposable system, you must be prepared to build dozens or hundreds of variants. Some of these variants might use different learning technologies or even just different settings for the learning engine. Other variants might be completely different restatements of the problem or the data that you use to train the models. For instance, you might determine that there is a surrogate signal that you could use to train the models even if that signal isn’t really what you want to predict. That might give you 10 times more data to train with. Or you might be able to restate the problem in a way that makes it simpler to solve.
The world may well change. This is particularly true, for instance, when you are building models to try to catch fraud. Even after you build a successful system, you will need to change in the future. The fraudsters will spot your countermeasures, and they will change their behavior. You will have to respond with new countermeasures.
So for successful machine learning, plan to build a bunch of models to throw away. Don’t expect to find a golden model that is the answer forever.
### Don’t be afraid to change the game
The first question that you try to solve with machine learning is usually not quite the right one. Often it is dramatically the wrong one. The result of asking the wrong question can be a model that is nearly impossible to train, or training data that is impossible to collect. Or it may be a situation where a model that finds the best answer still has little value.
Recasting the problem can sometimes give you a situation where a very simple model to build gives very high value. I had a problem once that was supposed to do with recommendation of sale items. It was really hard to get even trivial gains, even with some pretty heavy techniques. As it turned out, the high value problem was to determine *when* good items went on sale. Once you knew *when*, the problem of *which* products to recommend became trivial because there were many good products to recommend. At the wrong times, there was nothing worth recommending anyway. Changing the question made the problem vastly easier.
### Start small
It is extremely valuable to be able to deploy your original system to just a few cases or to just a single sub-problem. This allows you to focus your effort and gain expertise in your problem domain and gain support in your company as you build models.
### Start big
Make sure that you get enough training data. In fact, if you can, make sure that you get 10 times more than you think you need.
### Domain knowledge still matters
In machine learning, figuring out how a model can make a decision or a prediction is one thing. Figuring out what really are the important questions is much more important. As such, if you already have a lot of domain knowledge, you are much more likely to ask the appropriate questions and to be able to incorporate machine learning into a viable product. Domain knowledge is critical to figuring out where a sense of judgment needs to be added and where it might plausibly be added.
### Coding skills still matter
There are a number of tools out there that purport to let you build machine learning models using nothing but drag-and-drop tooling. The fact is, most of the work in building a machine learning system has nothing to do with machine learning or models and has everything to do with gathering training data and building a system to use the output of the models. This makes good coding skills extremely valuable. There is a different flavor to code that is written to manipulate data, but that isn’t hard to pick up. So the basic skills of a developer turn out to be useful skills in many varieties of machine learning.
Many tools and new techniques are becoming available that allow practically any software engineer to build systems that use machine learning to do some amazing things. Basic software engineering skills are highly valuable in building these systems, but you need to augment them with a bit of data focus. The best way to pick up these new skills is to start today in building something fun.
\---
*Ted Dunning is chief applications architect at MapR Technologies and a board member for the Apache Software Foundation. He is a PMC member and committer of the Apache Mahout, Apache Zookeeper, and Apache Drill projects and a mentor for various incubator projects. He was chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems and built fraud detection systems for ID Analytics (LifeLock). He has a Ph.D. in computing science from University of Sheffield and 24 issued patents to date. He has co-authored a number of books on big data topics including several published by O’Reilly related to machine learning. Find him on Twitter as @ted_dunning.*
**New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.**
| [DISTRIBUTED ALGORITHMS IN NOSQL DATABASES](https://highlyscalable.wordpress.com/2012/09/18/distributed-algorithms-in-nosql-databases/?from=hackcv&hmsr=hackcv.com&utm_medium=hackcv.com&utm_source=hackcv.com) | |
| [MAPREDUCE PATTERNS, ALGORITHMS, AND USE CASES](https://highlyscalable.wordpress.com/2012/02/01/mapreduce-patterns/?from=hackcv&hmsr=hackcv.com&utm_medium=hackcv.com&utm_source=hackcv.com) | |
| [Machine learning skills for software engineers](https://www.infoworld.com/article/3223688/machine-learning/machine-learning-skills-for-software-engineers.html?from=hackcv&hmsr=hackcv.com&utm_medium=hackcv.com&utm_source=hackcv.com) | |
\ No newline at end of file
| [DISTRIBUTED ALGORITHMS IN NOSQL DATABASES](https://highlyscalable.wordpress.com/2012/09/18/distributed-algorithms-in-nosql-databases/?from=hackcv&hmsr=hackcv.com&utm_medium=hackcv.com&utm_source=hackcv.com) | 详细的介绍了分布式的一些实现原理 |
| [MAPREDUCE PATTERNS, ALGORITHMS, AND USE CASES](https://highlyscalable.wordpress.com/2012/02/01/mapreduce-patterns/?from=hackcv&hmsr=hackcv.com&utm_medium=hackcv.com&utm_source=hackcv.com) | 介绍了MapReduce的原理及应用并附上伪代码 |
| [Machine learning skills for software engineers](https://www.infoworld.com/article/3223688/machine-learning/machine-learning-skills-for-software-engineers.html?from=hackcv&hmsr=hackcv.com&utm_medium=hackcv.com&utm_source=hackcv.com) | 对于软件工程师来说,是需要学习一些基础知识的 |