Dreaming the Limits of my Faith

2016.12.23, 2017.04.24

Warning: this post contains semi-graphic content just as I dreamed it.

I had a dream this morning, a half hour before my alarm was set to go off. In this dream, one of my children had a minor medical problem. I don't know what the medical problem was, but I think the child was Becca who, in some ways, is the most innocent of my children compared to her age. We turned her over to the care of medical doctors who did their diagnosis work and planned for a procedure to help her recover.

At this point, I must pause and provide some mental context. I am fascinated by how far the medical profession has gone. Every decade they are able to do more, to cure more. They are able to remove body parts, replace body parts, open up the human body to repair deeply hidden body parts. Sometimes it seems they have nearly reduced medical treatment to an engineering problem. But, I am one who has thought long and hard about things such as the teleportation depicted in the humanist and reductionist Star Trek. I have decided that if science could some day learn how to transport a person's physical body through the air in the form of a stream of particles or in the form of data, it would certainly kill that person. I believe that science and medical practice will never fully reduce the human soul to a human body-machine or to data that can be represented by a physically coded signal.

In that context, you should be as jarred by disbelief as I was when the doctors in my dream showed me what they had started to do with my little girl. For the minor medical problem she had (I don't know what it was), they found it convenient to completely dissect her. Not only was she dismembered, but her organs were separated by type and placed carefully in different compartments of a kind of crate lying on a table. It looked like a small meat display case at a grocery store. There were no coverings, no preservative liquids. There was nothing remaining intact that looked like it belonged to a human child.

I instantly realized -- as you can only do in dreams -- that they found this dissected state of the human body easier to work with. They would effect a cure in whatever part seemed best and then they would put it all back together. I was horrified and repulsed by this infinitely precarious situation and immediately woke up, unable to sleep any more.

What am I to make of this dream? I know some people who have that level of faith in human reasoning or medical skill. They believe medical science will some day allow doctors to reconstruct a human body like Frankenstein's monster or a robot. Apparently they live in the world of my dreams.


Some Notes from AAAI-16 in Phoenix

This is a summary of what I gleaned from the academic AI research conference AAAI 2016 in Phoenix, AZ.  There is some good stuff here for machine learning users. There are also a few famous AI names here and there.  I will bold words to help you skim.  I did not take notes during all the talks, just the tutorials and invited talks that I found engaging.  I went to standard research talks on game theory, reinforcement learning, NLP, etc., but did not take notes.  I even met and talked with a few people—one about ad tech, one about social diffusion, and a few others briefly.  For the most part, I was a nerd among nerds, a loner among loners.

Tutorial on CP-Nets (Conditional Preference Networks) by Thomas E. Allen, Judy Goldsmith, and Francesca Rossi

Using CP-Nets is a way to model combinations of preferences in a more tractable way.  Instead of creating a total order among all possible complete situations or states, they decompose each state into variables and orient those variables as nodes in a directed acyclic graph (DAG), just as a Bayesian network does.  These nodes express which decisions or aspects of the whole state are most important given the "parent variables", i.e. the decisions made already at the more important nodes.  Each child node in the network contains a table of preferences among the values of its variable given all possible values of the parent variables.  This is a qualitative ranking of preferences, not a quantitative relationship.
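To make this concrete, here is a tiny CP-net sketched in Python (my own toy example, not from the tutorial; the dinner-menu variables and preference rankings are made up):

```python
# A minimal CP-net sketch: the preferred Wine depends on the chosen Main.
# Each node lists its parents and a conditional preference table (CPT)
# mapping parent assignments to a ranking of that variable's values.
cp_net = {
    "Main": {"parents": [], "cpt": {(): ["fish", "meat"]}},      # fish > meat
    "Wine": {"parents": ["Main"],
             "cpt": {("fish",): ["white", "red"],                # given fish: white > red
                     ("meat",): ["red", "white"]}},              # given meat: red > white
}

def most_preferred(net, order):
    """Sweep the DAG in topological order, picking each variable's
    top-ranked value given the parents' already-chosen values."""
    outcome = {}
    for var in order:
        node = net[var]
        key = tuple(outcome[p] for p in node["parents"])
        outcome[var] = node["cpt"][key][0]   # first entry = most preferred
    return outcome

print(most_preferred(cp_net, ["Main", "Wine"]))  # {'Main': 'fish', 'Wine': 'white'}
```

Reading off the single most preferred outcome is just one sweep through the DAG like this; the interesting complexity results in the tutorial concerned harder queries, such as deciding dominance between two arbitrary outcomes.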

I also went to lunch with these guys.  Very friendly and inclusive.

Deep Learning by Reza Borhani, Jeremy Watt, and Aggelos K. Katsaggelos

A while back, an important problem was function approximation, and the technique people used was to learn a mixture of fixed basis functions.  There are a number of types of fixed basis functions that are popular.  For frequency (Fourier) type data, you might use a set of sinusoid functions of different frequencies.  For other kinds of data, you might use a set of polynomials of different orders (powers).  There are many others.  A good thing about fixed basis functions is that they have convex error surfaces, which means you don't have to worry about falling into local optima (weight settings that terminate search but are not globally optimal).  And yet they have other problems, and it turns out that the local optima problem with the non-convex error surfaces of neural nets is not really that big of a problem.  The problem with fixed basis functions is that they are not very efficient.  You need to combine a lot of different basis functions from your class to approximate an arbitrary function very well.
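To illustrate the convexity point, here is a small sketch (my own example, not from the tutorial) that fits a weighted sum of fixed polynomial basis functions to sin(x) by plain gradient descent.  Because the model is linear in its weights, the squared-error surface is convex, so there are no bad local optima to fall into:

```python
import math

# Target function sampled on a grid; the fixed basis is 1, x, x^2, x^3.
xs = [i / 10 for i in range(-10, 11)]
ys = [math.sin(x) for x in xs]

def features(x):
    return [1.0, x, x * x, x ** 3]   # the fixed basis functions

w = [0.0] * 4
lr = 0.5
for _ in range(2000):                # batch gradient descent on squared error
    grad = [0.0] * 4
    for x, y in zip(xs, ys):
        err = sum(wi * fi for wi, fi in zip(w, features(x))) - y
        for j, fj in enumerate(features(x)):
            grad[j] += err * fj
    w = [wi - lr * g / len(xs) for wi, g in zip(w, grad)]

# The weights land close to [0, 1, 0, -1/6], the Taylor coefficients of sin.
print([round(wi, 3) for wi in w])
```

The same descent loop run on a neural network would face a non-convex surface; here it is guaranteed to find the global optimum.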

A neural network is a set of feature basis functions for the purpose of function approximation.  They overcome some limitations of using fixed basis functions.  In a sense, neural networks "scale" better than fixed basis functions because they require a smaller number of features or basis functions than the fixed approach.  tanh is a good basis function for neural nets because it is adjustable (a lot like the logistic function which is what I learned in school).  That kind of adjustable basis function has been popular for a while, but recent improvements have given us a number of new basis functions that actually perform better than tanh and logistic (sigmoidal).

Training data quality is a spectrum from ideal to bad.  Ideal training data is characterized by two things: (1) it is not noisy and (2) it is evenly distributed over the input feature space (and he probably implied balanced in the output space too), which means no empty areas.  If your training data is close to "ideal", then it does not matter which basis function you use.  If not, then select your basis function wisely.

Gradient descent is equivalent to the back-propagation algorithm in this context.  Traditionally, you take one big step in weight-space after a full scan of all the training data.  Stochastic gradient descent is more incremental (less batch-oriented) in that it takes small steps, one for each datum in the training data; and it works better than batch updates.
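Here is a toy contrast between the two update styles (my own sketch; the one-parameter model and learning rates are made up for illustration).  Both minimize the mean squared error of a single parameter against noisy targets, whose optimum is simply the sample mean:

```python
import random

random.seed(0)
data = [5.0 + random.gauss(0, 1) for _ in range(200)]
target = sum(data) / len(data)        # the optimal parameter value

# Batch gradient descent: one big step per full scan of the data.
w_batch, lr = 0.0, 0.5
for epoch in range(50):
    grad = sum(w_batch - y for y in data) / len(data)
    w_batch -= lr * grad

# Stochastic gradient descent: one small step per training datum.
w_sgd, lr = 0.0, 0.05
for epoch in range(10):
    random.shuffle(data)
    for y in data:
        w_sgd -= lr * (w_sgd - y)

print(round(w_batch, 2), round(w_sgd, 2))  # both approach the sample mean (about 5)
```

On a problem this simple both work; the speaker's point was that on big, redundant datasets the per-datum steps of SGD usually reach a good solution with far less computation than full batch scans.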

A theoretical note with practical application: the target function must fall within the "span" (or space, think linear combinations of basis vectors in linear algebra) of the features or basis functions.

Neural networks are popular now because we have more data and Moore's law.  The recent big advancements in neural network research were inspired by biological neural networks, e.g. human brain and perceptual systems.  Convolutional neural networks use more biology knowledge in terms of layers of dimensionality reduction, function composition (or decomposition), and structure.

Other recent advances in neural nets include the following: (1) new choices for activation functions beyond tanh and sigmoidal, e.g. something he called "max" converges much faster requiring fewer iterations of learning, (2) better ways to initialize weights, (3) theory advances showing that non-convex error surfaces are not so bad, (4) new optimization processes, (5) new regularization strategies.
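I believe the "max" activation he mentioned is what is usually called the rectified linear unit, ReLU(x) = max(0, x).  A quick sketch comparing it to the older sigmoidal choices (my own illustration, not from the tutorial):

```python
import math

def relu(x):
    return max(0.0, x)               # the "max" activation: ReLU

def tanh(x):
    return math.tanh(x)              # classic adjustable basis function

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))  # the sigmoid I learned in school

for x in (-2.0, 0.5, 4.0):
    print(x, relu(x), round(tanh(x), 3), round(logistic(x), 3))
# At x = 4.0, tanh and the logistic are nearly flat (saturated, so their
# gradients vanish), while relu still passes the signal through unchanged.
```

That lack of saturation for positive inputs is one commonly given reason ReLU-style activations converge in fewer training iterations.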

For those of you who are doing deep learning with normal neural networks containing a single layer of hidden units, here is a definition for you.  A Deep Learning model is defined to have more than two or three hidden layers in a neural network.

Diffusion in Social Networks by Paulo Shakarian

This lecture was a survey of the field of research that models how ideas spread in social networks, with applications to marketing and "going viral".  On a high level, there are two approaches: One, you could construct a detailed graph (network) model of the social network and run simulations to see what happens.  There are many competing theories and none of them look detailed enough to work very well.  This approach is better for explaining what might happen in the network than actually predicting diffusion.  Two, you could use machine learning to more directly learn to predict how far ideas might spread based on a variety of features of the phenomenon, including the early stages of propagation.  This approach does better than the first approach at prediction, but neither approach works as well as you might hope.

Here are a few sketchy details.

Questions to answer:  How do thoughts spread in a social network?  How many infectees after T time?  Influence Maximization Problem.  Profit Maximization.  Takes theories from various fields like mathematical sociology.  Perform simulations.  The first iteration of the simulation is most sensitive to out-degree and probability (beta) of transmission.

Models: SIR and SIS models, tipping model and target set selection, Kempe-Kleinberg-Tardos (KKT) framework, logic programming based diffusion, evolutionary graph theory (EGT).

SIR model: Susceptible means able to be infected.  Infected means able to infect.  Removed means not able to infect or be infected.
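Here is a minimal SIR simulation on a random graph (my own sketch of the model just described; the node count, edge count, and transmission probability are made up):

```python
import random

random.seed(42)
N, BETA = 100, 0.3                   # nodes; per-edge transmission probability
edges = {i: set() for i in range(N)}
for _ in range(300):                 # wire up random undirected edges
    a, b = random.randrange(N), random.randrange(N)
    if a != b:
        edges[a].add(b)
        edges[b].add(a)

state = {i: "S" for i in range(N)}   # everyone starts Susceptible
state[0] = "I"                       # one initial Infected node
while any(s == "I" for s in state.values()):
    infected = [i for i, s in state.items() if s == "I"]
    for i in infected:
        for j in edges[i]:
            if state[j] == "S" and random.random() < BETA:
                state[j] = "I"       # Susceptible -> Infected
        state[i] = "R"               # Infected -> Removed after one round
print(sum(1 for s in state.values() if s == "R"), "of", N, "nodes were infected")
```

Note how the first round's spread depends only on node 0's out-degree and BETA, matching the sensitivity observation above.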

Tipping Model: Polynomial-time infection-count prediction.  Non-submodularity.  Algorithm for finding a seed set that can spread infection to the whole graph in minutes with millions of nodes and edges.  Linear Threshold: uses probabilistic thresholds.

Kempe-Kleinberg-Tardos (KKT): Independent Cascade (IC) Model allows different probability for each connection. Transmission is from sum of weights. Live-Edge (Live-Arc) Model.  Generalized Threshold Model.  Logic-based Diffusion: Annotated rules for diffusion. Write an "annotated program" in logic, based on attributes of nodes and relations, (including probabilities? Maybe not).  Like deterministic tipping model.  Computing fixed point (maximum spread), which is polynomial in time and space.  Multiple labels on nodes and relations.

Evolutionary Graph Theory: Mutants taking over.  Moran Process.  Goal to compute fixation or extinction probability.  Isothermal graph.

Data-driven approaches: Learn model parameters from past diffusion processes.  Regression model to predict spread from features of past diffusion processes.  EM process hard to scale but accurate for early stage diffusion (non-viral).  Tends to over-fit using small influence probabilities (10^-7).  Temporal effects: exponential decay of influence is slow.  Discrete time model is a good compromise.  Dynamic models are better, Bernoulli model better than Jaccard.  Discrete time comparable to continuous, which is nice.

Credit Distribution Model: Fast runtime for Influence Maximization.

Power-law of cascades, large cascades are rare.  Features that are predictive: seed user's number of followers, number of friends, past local influence; content's tweet length, hashtags, divergence of sentiment; early diffusion's number of out degrees in subgraph, etc.  Regression trees, 0.98 R^2, but poor final result.  Logistic regression with balanced data got good results.  Original poster and content becomes less important as initial cascade progresses.  Number of influencers (incoming edges) has less effect above 3. Number of influencing communities has a greater impact.  Structural diversity is a good feature.

His startup: CrossViral

I also had a good chat with Dr. Paulo a few days later.

Inference in Structured Prediction by Dan Roth, Kai-Wei Chang, Gourab Kundu, and Vivek Srikumar

This was very related to my research area in school, so a lot of this was review for me.  Dan Roth was one of the few names I recognized at the conference.  I have to agree that structured prediction problems are among the most important (and still very general) prediction problems.  They include NLP, information extraction, and image object recognition problems.

Integer Linear Programming (ILP) is cool and very useful and popular these days in ML and AI.

Reading the Web

This talk was very interesting and also related to many of the papers I read in preparation for my PhD dissertation, so a lot of this talk was a review for me.  He talked about bootstrapping (the unsupervised and highly scalable process of learning extraction rules from extracted knowledge and vice-versa) and the strategy of using multiple constraints or sources of evidence to prevent semantic drift.  Tom Mitchell's Never Ending Language Learning (NELL) project and Oren Etzioni's Web-reading work are the more famous projects in this area.

I saw a slide with Justin Betteridge on it, who I already knew was working in the NELL group at CMU.  I knew Justin during my master's degree at BYU.  Very nice guy.

How AI Influences the Labor Market (Panel Discussion)

Random bits of pessimism and optimism and ego-boosting comparisons among panelists (mostly Oren Etzioni) interspersed with a few random but interesting line plots and quotes.

Google DeepMind (Invited Talk) by Demis Hassabis

DeepMind was sold to Google for $500M.  Demis Hassabis and his company have a very impressive background.  He has been planning DeepMind for 20 years, and intentionally worked in the best of academia and startups as preparation.  Their goal is to "solve intelligence", by which catchy phrase I believe he means to "solve the impossibly hard problems necessary for creating a truly general-purpose artificial intelligence."  Strategies they adopted early on were (1) learn from raw inputs like the pixels on a screen of a video game, (2) be general, not brittle and narrow.  As a counter-example, he mentioned that DeepBlue was narrow though still impressive, because it would have to be completely retrained to play anything other than chess.

They are known for deep learning, but they are by no means only about deep learning.  They prefer thinking in terms of reinforcement learning.  This involves learning a statistical model of the world, then learning actions.  They believe that solving reinforcement learning would solve general learning and intelligence.  They believe in "grounded cognition".  Robotics hardware is expensive, slow, and it breaks.  Therefore they use simulators, which can make as much training data as you want, and they ground themselves in the pixels of video game images.  They prevent testing bias, but I forget how.  They can run tests on thousands and millions of agents at once.

They use Deep Reinforcement Learning, which to them means getting reinforcement learning to work at scale.  Their early task was to learn how to play 8-bit games from the '80s on the Atari 2600.  DQN.  Code released and discussed in articles in Nature.  General purpose game learner.

They draw inspiration from the brain and neuroscience.  Their system is a Neural Turing Machine: ANNs with memory.  It is a full Turing Machine.  AlphaGo is their new project.  It is based on pattern recognition, planning, tree search, and Monte Carlo rollouts.  AlphaGo was the first program to beat a professional Go player on a full-sized board.  They have hundreds of papers on their website.  Next month, in March, they will live stream on YouTube AlphaGo playing a big pro, Lee Sedol.  This will be a $1M 5-game challenge match.  In preparation, they are retraining everything.

Difference between AlphaGo and DeepBlue: AlphaGo is a general-purpose, modular system combining deep learning based pattern recognition with planning and search.  They will later remove supervised learning; training time will be longer.

AlphaGo is more human-like.  Human players say AlphaGo plays like a human; so they say it passes a "Turing test" for gamers.

Montezuma's Revenge is hard for their Atari agent (DQN).

They used 100 GPUs before being bought by Google while working on Atari games.  This was a response to a question about how to do this kind of research on a small budget.  He says it's a good domain for small budgets.

Extra Bits

I had fun watching the ASU robots do their thing in the foyer between research paper sessions.

One of the posters in the poster session was about learning to automatically generate posters from scientific papers.  I asked her the obvious question.  No, her poster was not automatically generated in toto.  (One of the figures was.)  But she did get some good attention and chuckles when she introduced her topic during her spotlight talk.

Here is a web app (and a university research project) helping people make "fair" decisions involving finite resources: http://www.spliddit.org/

It was interesting to eavesdrop on Peter Norvig and someone else I did not recognize who were talking about "academic genealogy" and their "shared ancestors".  It seems that the better your academic "pedigree" is, the more you worry about academic pedigrees.


Various Artworks

Here I collect a few random pieces of my visual artistry.

This drawing tells me what I want to do and be in the future.  In particular, it tells me two things: I wish to draw pictures with words as a writer, and I wish to fade into the background, as the words of this parable are the real message.  This drawing was done in pen and ink on paper by hand before being scanned into the computer and having color added.

My first attempt in many, many years to combine computer science with art.  I call this a cuneiform binary fractal tree to 19 levels of recursion.  If you have Googled for fractal tree, you will have seen similar trees.  What makes mine unique, I think, is that it is entirely algorithmically generated and yet has a semi-natural shape, thanks to my adding a small amount of uniform-random noise to the branching angle and also choosing a random order for growing the left and right child branches at each branch point.  I think it strikes a balance between too regular and too irregular.  I also like the fact that it is composed entirely of such a simple shape: triangles.  Oh, and I also prevent overlap among all branches in the tree by keeping track of points on lines and pruning any branch before it crosses an existing one.  You must understand, my intention is to refine the two-dimensional appearance, preventing most or all appearance of depth.  I could do more.  I plan to leave more space between branches to prevent the crowded look.  I could also add a little noise to the branch length.  Needless to say, there are more trees to come.
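For the curious, here is a sketch of the growing algorithm in Python (my own reconstruction from the description above, not the original code; the spread angle, shrink factor, and noise range are made up, and the overlap pruning is omitted for brevity):

```python
import math
import random

random.seed(7)
SPREAD = math.radians(30)    # base branching angle
SHRINK = 0.7                 # each child branch is shorter than its parent
NOISE = math.radians(6)      # uniform-random jitter on the branching angle

def grow(x, y, angle, length, depth, segments):
    """Recursively grow a binary tree, recording each branch as a segment."""
    if depth == 0:
        return
    x2 = x + length * math.cos(angle)
    y2 = y + length * math.sin(angle)
    segments.append(((x, y), (x2, y2)))
    turns = [-SPREAD, SPREAD]
    random.shuffle(turns)                    # random left/right growth order
    for turn in turns:
        jitter = random.uniform(-NOISE, NOISE)
        grow(x2, y2, angle + turn + jitter, length * SHRINK, depth - 1, segments)

segments = []
grow(0.0, 0.0, math.pi / 2, 1.0, 10, segments)   # 10 levels here, not 19
print(len(segments))  # a full binary tree of depth 10 has 2**10 - 1 = 1023 branches
```

The segments could then be rendered as triangles and pruned against each other to reproduce the overlap-free look described above.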


Friendships and Communities are Personality Amplifiers

Friendships and communities are personality amplifiers.  Never do we feel as justified in our beliefs and actions as we do among others who have propensities or tendencies toward common beliefs and actions—no matter whether those beliefs and actions are good or bad.  The unconscious feeling of justification we gain from each other inevitably leads to a growth in those feelings and actions within the individual members of the friendship or community.  So, the reason I say that community and friendship amplify personality is not just a matter of that personality being recognized among a larger number of people.  It truly grows within each person as a consequence of being validated by the others in the group.

The example that prompted this thought is my two little boys.  They are two years apart: one is three, the other is five.  They are the best of friends—it seems they do everything together.  When one wants to act like he is a puppy by crawling around on the floor with his tongue hanging out of his mouth, the other will always follow.  I have a third boy who is 11.  He was not given the luxury of having a male sibling close enough in age that I would call them "friends".  I can now see how personality is amplified by friends.  The older boy might have gotten into a small amount of trouble once in a while on his own, but the two younger boys are constantly out of control when they are together.  I watch them carefully.  If one of them takes offense at my denying him an extra cartoon and he decides to express his being offended with weeping and wailing, the other—not quite instantaneously—follows suit.  And this pattern goes for positive behavior as well.  If one boy—miraculously—decides that it might be fun to obey his parents' wish to do some cleaning, the other boy will go right along with the game.

I hesitate to complete my thought, though it is such a natural progression.  I hesitate knowing the pain out there in the world caused by beliefs and actions that were amplified beyond all rational proportion.  Terrorism is a community first, and a force of destruction second.  It makes little difference whether that community is a local community or an online community.  Yet, at the same time, community can amplify the most positive beliefs and actions.  This is why it is so important to have a church, a synagogue, a mosque—any community built on positive beliefs and actions.  If we are to ever hope to progress, morally, spiritually, ethically, as far as we are capable, we must live in a community of like-minded people—and interact regularly.

Football vs. Funerals

Today I attended a funeral.  Call me morbid, but I prefer attending funerals to football games.  That is, you can try to call me morbid, but you would be mistaken.  Are funerals morbid?  Not at all.  Just like a well-written obituary, there are few thoughts as rewarding to hear or read as those lessons extracted from the perspective of an entire lifetime well lived.  The speakers at funerals speak about the best and brightest moments of life.  They know how to put things into perspective.  They speak words of comfort when comfort is desperately needed.  Those same words may help you in the future when you least expect.

On the other hand, someone could make a case for football being morbid.  At the same time I was attending that funeral, I am sure someone, somewhere, was attending a football game and thinking, "This sure beats attending that funeral my wife/mom wanted me to go to."  What is football other than a thinly veiled reenactment of war?  The two sides line up on their field of battle dressed in armor complete with coats of arms, ready to do their best to punish the other side for having the audacity to exist and breathe the same air.  One side makes an offensive move while the other side takes a defensive stance against it.  They take turns doing this until time runs out.  Bodily injury is common.  In fact, there is nearly one injury per player per year in the USA.  (https://en.wikipedia.org/wiki/Health_issues_in_American_football)  At the end of the game, what do the fans leave with?  The fans of the losing team feel bad about something they had no control over, while the fans of the winning team get all inflated with pride based on just as rational a reason.

On the battlefield of life, when faced with the question of football vs. funerals, there is no contest.  I will choose the funeral every time.  Why?  Because I love life.  On the other hand, depending on how the teams play, sometimes we are lucky enough to witness both a football game and a funeral on the same day.

This could spark a little controversy.  Just note that I do not have anything against people who play or watch football.  Two of my favorite people in the world loved watching football (my dad and my grandpa).  I am simply trying to relate an interesting comparison that popped into my head today while waiting for the action to start.


Drawing Megan

This is the story of how I drew Megan.  Sorry that most of the photographs were taken by my extremely cheap phone camera.

There once was a beautiful young woman who got married to a nerd.  Before the wedding, there were pictures taken of the beautiful young woman.  The nerd has ever since been especially fond of one picture in particular.

So I decided to use this picture as the basis for a large pen-and-ink drawing.  Here is the beginning of the initial pencil drawing of the main elements.

Here is the completed pencil drawing of the main elements (darkened so you can see the pencil lines better).

I had originally planned to draw a plain cross-hatched grid for the background, like you can see in traditional engraved portraits.  Then I realized that I should fully execute my plan for practicing techniques to put in future illustrations by adding something more fractal-like.  What I needed was a fractal-like curve that had similar properties to a grid: it filled the space evenly with a uniform line thickness.  It did not take me long to discover the Hilbert Space-filling Curve, or Hilbert Curve for short.  What did take me a while to find was an existing image of it pre-rendered at a sufficient level of detail to cover my entire picture.  But then I had to draw it without being able to trace it.  That required more time than it took to find it.
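For the curious, the Hilbert curve is easy to generate with the standard L-system rules (this sketch is my own and generates the curve's points; it is not how the drawing itself was produced):

```python
def hilbert_points(order):
    """Generate the Hilbert curve via its L-system:
    A -> +BF-AFA-FB+ and B -> -AF+BFB+FA-,
    where F draws one segment and +/- turn 90 degrees."""
    s = "A"
    for _ in range(order):
        s = "".join({"A": "+BF-AFA-FB+", "B": "-AF+BFB+FA-"}.get(c, c) for c in s)
    x, y, dx, dy = 0, 0, 1, 0
    points = [(x, y)]
    for c in s:
        if c == "F":                 # step forward, drawing a segment
            x, y = x + dx, y + dy
            points.append((x, y))
        elif c == "+":               # turn left 90 degrees
            dx, dy = -dy, dx
        elif c == "-":               # turn right 90 degrees
            dx, dy = dy, -dx
    return points

pts = hilbert_points(4)
print(len(pts))  # an order-n curve visits all 4**n grid cells: 4**4 = 256
```

The space-filling property is exactly what made it a good stand-in for a cross-hatched grid: it covers the plane evenly with a single line of uniform thickness.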

Next is the Hilbert space-filling curve print-out, which I marked up in pencil as I copied it line by line to my large drawing's background, on a pencil grid that I had already drawn as a guide.  I drew a few tiny line segments at a time.  At one point I discovered that I had drawn a section of the curve (maybe a few hundred line segments big) in the wrong place—it was shifted by one unit of the grid.  I had to erase that part of the background and redraw it.  Thankfully I had first drawn the curve in pencil before inking it.  I remember doing much of the inking of the Hilbert curve while I "watched" the extended Hobbit movies with my family.  Actually, I also associate this pen-and-ink portrait with a number of fascinating audio-books that I listened to while drawing it, including biographies of C.S. Lewis and J.R.R. Tolkien, The Wood Beyond the World by William Morris, Phantastes by George MacDonald, a lecture-summary of the book "Goedel, Escher, Bach", a long series of lectures about life in the Middle Ages, the extended Lord of the Rings movies, and several others.  One can listen to a lot of good material in about 150 hours.

Here is the final drawing sitting on my drawing table.  You can see my laptop plugged in to my big Altec Lansing speakers in the background, ready to play me another audio-book.

Here is the final pen-and-ink drawing professionally digitally scanned at Replicolor in Salt Lake City, Utah.

Here I have digitally made some minor corrections to the black and white line drawing and have added color.  I consider this the final version.  This is the same version I have listed in my American Frame Art Gallery.  The coloring is subtle—on purpose.  On my laptop, I can barely see the colors, while on my desktop they are easier to see.

After finishing the color version, I tried out some of the filters that come with the free image editing tool called Gimp.  This image shows the mosaic distortion filter.  I love the blending of the colors in tiles around the top of the rose, including the brown of Megan's hair, the pink of her lips and the rose, the subtle blue of the background, and the green of the leaves.

This image shows double-thresholding—that is, mapping light and dark greys to black and middle-tone greys to white.
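In code, double-thresholding is a one-line decision per pixel (my own sketch; the actual Gimp filter and the cutoff values it uses may differ):

```python
# Hypothetical cutoffs on a 0-255 greyscale: intensities between LOW and
# HIGH (the middle tones) map to white; everything else maps to black.
LOW, HIGH = 85, 170

def double_threshold(grey):
    return 255 if LOW <= grey <= HIGH else 0

row = [10, 100, 140, 200, 250]                 # one row of pixel intensities
print([double_threshold(g) for g in row])      # [0, 255, 255, 0, 0]
```

Applied to every pixel of the scanned portrait, this is what turns both the darkest ink and the lightest paper black while the mid-tone shading pops out white.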

That about wraps it up. 


What Does the Blog Say?

Hello blog-lovers, who also happen to be parody-lovers, and who also happen to be "The Fox" lovers, who also happen to love commentary on digital and social media.  Yep, I knew that included all of you.  I proudly present a refrain that woke me up in the middle of the night, last night, as I lay in bed, unconsciously merging two things I had been "listening to" the evening before. 

"The Blog"

Pandora goes woof, YouTube goes meow.
Twitter goes tweet, and FaceBook goes like.
Flickr goes snap, MySpace goes croak, and LinkedIn goes toot.
WebMD says quack and eBay goes bling, and CNN goes OUCH OUCH OUCH.
But there's one sound that no one knows...



I'll finish this later.   

If anyone would like to collaborate to finish this, perform it, record it, let's talk.