Why is there malware?
	Glenn Everhart

The motives for malware existing (and for computer abuse) may lead to insight
about how it might be reduced.

I can see at least a couple such motives:
* financial gain
* avoiding consequences of other actions (such as illegal presence in a country,
	past history of crooked behavior or bad economic practice, etc.)

In general these involve trying to avoid being judged according to
one's own past history or other dealings, i.e., getting something or someone
to associate an action with the wrong person (who might or might not really exist).

In other words, if you can associate an action with a human author, the presumption
is that there will be less motive to attempt to build fraud software. 

(This is pretty basic. If every financial transaction strongly couples to an
identity of a human, whose history and finances can be queried, how can
the person steal money via transactions? A thief must go back to bare theft
without electronic aids. Likewise, if an identity is checked every time
someone starts an account, gets a job, deals with any government, etc.,
how then does someone avoid his past? There can be disturbing side effects
in that old mistakes may never be lived down (recall "let him who is without sin
cast the first stone": everyone has things to forget). Going back to birth is
not needed, though, for most purposes. On the whole, what should be looked for is
evidence that someone has been honest in recent years, not a blameless life.)

There are two components here: 
* Establishing that an identity is a real human connected with a substantial history of behavior, and
* Establishing that a given person corresponds to such an identity.

Both have difficulties.
Consider some issues known in the US.
* Because names are highly redundant (a name that is held by one individual only is
extremely rare), id numbers get used to identify people. (Knowing such a number is
also abused as a way to authenticate its holder, another issue.) However, in the US,
SSNs can be queried free only to tell whether a given SSN has been issued. The name
and birthdate associated with it cost extra to learn (a few dollars per number),
while the number space contains under a billion numbers, for a population of
0.3 billion, so that guessed numbers are likely to exist. There are reports that
in collections of SSNs
in US systems outside the government, 15% or so of the SSNs correspond to faked
identities. Some of these are likely illegal immigrants, but the reports suggest
that this is very far from the whole. (15% of the population would amount to
45 million bad numbers, 2-3 times the reported illegal immigrant population.)
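
The arithmetic behind these figures can be checked with a short sketch. The
inputs are only the round numbers quoted above (not official statistics), and
the variable names are made up for illustration. The sketch assumes at least
one issued number per living person, which gives a lower bound on how often a
blind guess lands on a real SSN.

```python
# Rough arithmetic behind the SSN figures quoted in the text.
# All inputs are the essay's round numbers, not official statistics.
SSN_SPACE = 1_000_000_000   # nine digits: an upper bound on the number space
POPULATION = 300_000_000    # "0.3 billion"
FAKE_FRACTION = 0.15        # reported share of faked identities

# Lower bound on the chance that a blindly guessed number has been issued,
# assuming at least one number per living person.
hit_chance = POPULATION / SSN_SPACE

# Bad numbers implied if the 15% figure applied to the whole population.
bad_numbers = FAKE_FRACTION * POPULATION

print(f"chance a guessed SSN exists: at least {hit_chance:.0%}")
print(f"implied bad numbers: about {bad_numbers / 1e6:.0f} million")
```

With these inputs, roughly three guesses in ten hit an issued number, which
is why a free "has this been issued?" query proves so little.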

That even as little information as the government has is not available for bulk
checks save at high cost makes it hard to ensure that a claimed identity matches
even an existent human being. 

It can be noted too that credit bureaux have not helped here, in that when they
encounter some name and set of records, and see that it matches a SSN that is
already in use, they just create new records (or possibly just drop the second
one). Since they don't know which, if either, of a pair of records carries the
SSN the government actually issued to that person, no simple strategy there can
pick out a real identity. Since they do not issue some longer identifier which might at
least disambiguate the records, their records are largely contaminated.
(It does not help that the illegal immigrant problem is widely known, so that
thieves can and do select Hispanic names to lull suspicions; the illegal immigrant
who is using a bogus SSN generally will not be a fraudster, but of course
thieves who are looking for a workable masquerade appear and disappear, to avoid
leaving long term tracks.)


Now, it is possible to use other evidence - old tax bills, utility bills, mortgage
or property records, etc., to establish that an identity is genuine. Whether that
identity corresponds to the person who shows up claiming it is, however, harder
to tell. Most of this kind of thing is computerized now, and in rather permeable
storage areas. Simply knowing about the information there is not enough, as it
can be tracelessly copied. (It can also be asked for; people often just provide
such information, and much public information is freely visible to all. Look at
mortgage loan records on county government sites, for example...) You can also buy
fake ID documents, or software to create them.

Enrollment problems (that is, initially determining that an identity is real
and not faked, and that it has something to do with the person who shows up)
remain hard. Ultimately asking for several pieces of evidence AND waiting for
a history with you or your institution to be developed before extending full
trust may be the best that can be done. Clearly the stronger the evidence
you can get, the easier it is to deal with this. (Some help from the SSA, or
from similar agencies in other countries, would be most useful.)

Once this is solved, how do you handle repeated instances?

I would like to suggest that pure information is unsuitable for this, because
it has no inherent way to prevent copying or theft. If, once you have
someone enrolled, you give the person a token, the token, being material,
cannot be so readily copied. If it is designed well, it can be made hard
to duplicate also (and will need to be). If you decide to use a biometric,
such as a fingerprint or perhaps a DNA sample, that is hard to duplicate
as well, but for it to mean anything, you must know somehow that it came from
the human in question. We leave DNA and fingerprints all over the place, and if
someone shows up with a fingerprint, for example, that was not taken directly
from a living finger, there's no way to know where it came from. Some
biometric systems do attempt to ensure liveness, and can reduce the risk
of such failures as may occur with stolen prints, iris images, or DNA.

In effect, the token (with a unique key inside and tamperproofing), or the
biometric system using lots of information to help ensure liveness, gives
greater assurance that a repeat encounter matches an old one.
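
As a minimal sketch of how a token with a unique key inside can show that a
repeat encounter matches the enrollment, consider a challenge-response
exchange. Everything here is an assumption made for illustration: the class
names Token and Verifier are hypothetical, and a shared-key HMAC stands in
for whatever a real token (an EMV card, say) actually does. The point is only
that the key never leaves the token and each challenge is fresh, so a
recorded answer cannot be replayed.

```python
import hashlib
import hmac
import secrets


class Token:
    """Stand-in for a tamper-resistant token; the key never leaves it."""

    def __init__(self, key):
        self._key = key

    def respond(self, challenge):
        # Prove possession of the key without revealing it.
        return hmac.new(self._key, challenge, hashlib.sha256).digest()


class Verifier:
    """Institution side: keeps the key recorded at enrollment."""

    def __init__(self):
        self._keys = {}  # user id -> key (hypothetical storage)

    def enroll(self, user_id):
        key = secrets.token_bytes(32)
        self._keys[user_id] = key
        return Token(key)

    def new_challenge(self):
        # A fresh random challenge for every encounter defeats replay.
        return secrets.token_bytes(16)

    def check(self, user_id, challenge, response):
        key = self._keys.get(user_id)
        if key is None:
            return False
        expected = hmac.new(key, challenge, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)


bank = Verifier()
token = bank.enroll("alice")
challenge = bank.new_challenge()
print(bank.check("alice", challenge, token.respond(challenge)))
```

In this sketch the verifier holds a copy of the key, so the verifier's key
store is itself an attack target. A public-key scheme would avoid that, at
the price of more capable token hardware.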

This kind of measure can be inexpensive - a few dollars for a token (in large
deployments) is feasible. EMV cards have reached a low price point. Liveness
systems based on smart phones are approaching usability (provided charges
for data transfer can be kept from spoiling the system's usefulness).

However, for this to reduce malware incentives, such measures must be pretty
much ubiquitous and required by all institutions offering services that
are involved with the motives. That means government, merchants, financial
institutions. If any one group required such, the others might follow.

It should be noted that government has a poor record here because it is
not subject to as much loss as commercial areas. Regulations notoriously
are behind the times (consider that the US government kept DES as a
crypto algorithm standard until a cheap machine that could crack any
DES key was built and its design published, despite years of warnings
that the cipher was too weak). A news report I just read claims that
card fraud amounts to ~1.5 billion per year. That gives the commercial
groups some incentive to avoid it.

Take away the motivation for gain by malware, and might we not hope that
less malware might be written? The savings from having fewer clever
cracks to avoid could be huge.

When we think of tokens given out, it must, by the way, be remembered that such things can
be vastly abused. The same can be true of biometrics. If authenticating is
not a conscious and voluntary act (think of signing something) it can be done
to track people and allow no privacy. That is, if we suppose everyone gets
a remote readable ID gadget implanted (as a for instance), it becomes
possible to track anyone's actions to whatever degree of detail an authority
might want. That amounts in practice to the ability to arrest and punish
anyone at will, since there are too many (often obscure) laws and too many
chances for anyone to infringe some of them, regardless of the benignity
of the person's intent. (Let those who doubt consider how they feel when
being followed by a police car while driving, even though they are driving
legally as far as they know.)

Also any single identifier handled at a single place is an attack target. 
While we want to be able to know that a person is real and has a history,
on the whole being able to have several ways to authenticate someone,
each able to show "this is the person that Known Person(s) X have dealt with
for time T", for various X and T, gives a more robust system, in that
a loss of one X's system can be worked around with others. Also that kind
of assertion does not require knowing everything about someone's history,
permitting a tie-back to someone's behavior or reputation without encouraging
so much surveillance that any mis-step a person makes might ruin the person's
life thereafter, often at the discretion of whoever is doing the surveilling.
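
The idea of several independent assertions of the form "X has dealt with
this person for time T" can be sketched as a small data structure. The names
(Attestation, sufficient) and the thresholds (two attesters, three years) are
hypothetical choices made for illustration only. The sketch shows how losing
one attester's system can be worked around with the others, and that no
attester needs to reveal anything beyond its own name and a duration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    attester: str  # the "Known Person X" (bank, employer, utility, ...)
    years: float   # the time T that X has dealt with this person


def sufficient(attestations, compromised=frozenset(),
               min_attesters=2, min_years=3.0):
    """True if enough uncompromised attesters vouch for a long enough history."""
    usable = {
        a.attester
        for a in attestations
        if a.attester not in compromised and a.years >= min_years
    }
    return len(usable) >= min_attesters


claims = [
    Attestation("bank", 7),
    Attestation("employer", 4),
    Attestation("utility", 5),
]
print(sufficient(claims))                                    # True
print(sufficient(claims, compromised={"bank"}))              # True, two remain
print(sufficient(claims, compromised={"bank", "employer"}))  # False
```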

These considerations tend to militate against use of biometrics in many
cases, since there are too few such available and most of them must be
kept for decades, so that separating assertions becomes hard.

A token based system needs to be able to authenticate user and authenticator
to one another in a way positively present to both, and needs to be
able to show that elements of a transaction are accepted by both parties
as well. It must not rely on prodigies of human memory (lest it fail;
people don't remember "secret facts" well unless they use them a lot). It 
may however use a protocol with the user as part of its action, both
to simplify the machinery and to ensure the authenticating user is
consciously performing his authentication operation. (This presumes that
the trust granted by this operation has been set corresponding to
what is supported by experience with this user. A system key for the user
to authenticate with should grant keys to the kingdom only once you have
been convinced by history (and perhaps other tokens and testimony of their
operators) that the key should give such power.)

There are various kinds of simple puzzles a human can do which can make
for elements of good protocols; selection of elements in a spatial pattern
is one example of such. I will remind readers that a human selection
of some digits in some positions of a display of pseudo-random digits
in effect can give an encryption of the selection (so long as the selection
is done by eye and all that can be observed by an adversary is the
selected (few) digits entered elsewhere). This kind of cipher-lock scheme
is used for car doors and seems easily learned.
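
The digit-selection puzzle can be sketched as follows. This is an
illustration under stated assumptions, not a vetted protocol: the grid size,
the function names, and the idea that the user's secret is a list of
positions fixed at enrollment are all hypothetical. The user reads, by eye,
the digits sitting at the secret positions of a fresh display and types only
those digits somewhere else.

```python
import hmac
import secrets

GRID_SIZE = 20  # digits shown per challenge (hypothetical choice)


def new_grid(size=GRID_SIZE):
    """Fresh pseudo-random digits to display to the user."""
    return [secrets.randbelow(10) for _ in range(size)]


def read_off(grid, positions):
    """What the user does by eye: read the digits at the secret positions."""
    return "".join(str(grid[p]) for p in positions)


def verify(grid, secret_positions, entered):
    """The verifier recomputes the expected digits from the stored positions."""
    return hmac.compare_digest(read_off(grid, secret_positions), entered)


secret = [3, 7, 12, 18]           # positions chosen at enrollment
grid = new_grid()                 # a new display for every attempt
entered = read_off(grid, secret)  # the few digits an observer might see
print(grid, entered, verify(grid, secret, entered))
```

Because the display changes every time, the entered digits alone say little
about the positions. The sketch relies on the adversary not seeing the
display. An observer who records both display and entry over several attempts
can narrow the positions down.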

There will be a temptation to say "let's build some smart tokens then, and
let one set of hardware do for all so people need not carry a bunch of them
around". (This, rather than having a token that is small and shaped to be
simple to carry a dozen around without hassle.) The suggestion for such
needs to be viewed with caution, in that you don't want an attack on
one token to become an attack on all. Nor do you want to enable side
channel attacks (especially when carried out by malware that may be wired
to the "token hardware" and able to do precision power or timing measures
unbeknownst to the user). It is also important as a design point to
recall that the humans using an authentication system will
be strongly biased toward using the same methods (and, likely, secrets) for
each case. Whatever can observe that can be part of an attack. Also the
device to be used is potentially under the control of an attacker. Don't
design a system that depends on anything in the hands of an attacker
being uncompromised.

The elements of good transaction authentication have, in a way, been known for
many years. A simple bank check, as used, say, in about 1960, was a decent
example. The document had the date, payee, amount (twice so it would be gotten
right), payor, and a signature which in those days would be recognized by
banks (which were much smaller then and would generally know their depositors
personally). The check was only good at one bank, and the authenticity of the
signature (and indeed the possession of that person's blank checks as a
supporting piece of evidence) showed the bank that it was a genuine document.
(The scale of banking made it easier then to contact someone if a very unusual
check were seen too.) Checks did get altered, but the need to alter the amount
in two places, and sometimes paper which made erasure obvious, made this
not trivial to do. The signature showed (generally) acceptance of the whole
transaction (including payee and amount). 
Likewise transactions with government tended to rely on signatures covering
many parts of an operation. (They still do; you have to sign your tax
forms, for instance.) There's not much doubt about the identity of the
government most of the time. Doubt is growing in the other direction,
though, with the widespread corruption of SSN and other databases by
forged information. I will note that some of the strange cases about people
appearing on "do not fly" lists might have to do with the low integrity of
US federal lists of names.
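
The property that made the paper check a decent example, one signature
covering date, payee, amount and payor together, has a simple electronic
analogue. The sketch below is an assumption-laden illustration: the field
names, the demo key and the use of an HMAC (a shared-key authenticator, not a
true signature) are stand-ins chosen for brevity. It shows only that altering
any one field invalidates the authenticator over the whole transaction.

```python
import hashlib
import hmac
import json


def sign_transaction(key, txn):
    """Authenticate every field of the transaction at once."""
    # A canonical encoding, so the same fields always give the same bytes.
    canonical = json.dumps(txn, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()


def verify_transaction(key, txn, signature):
    """True only if no field has changed since signing."""
    return hmac.compare_digest(sign_transaction(key, txn), signature)


key = b"demo key held by the payor's token and the bank"  # hypothetical
check = {
    "date": "1960-05-01",
    "payor": "A. Payor",
    "payee": "B. Payee",
    "amount": "25.00",
    "bank": "First Example Bank",
}
sig = sign_transaction(key, check)
altered = dict(check, amount="2500.00")

print(verify_transaction(key, check, sig))    # True
print(verify_transaction(key, altered, sig))  # False
```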