VLANs vs Subnets



Do you want to restrict traffic at layer 2 (switch - VLAN) or layer 3 (firewall - subnet) or both?

Do you need to cut down on broadcast noise (VLAN)?

How much overhead do you want to manage? 

Most VLANs are tied to one subnet, so you will typically see subnets without VLANs but not the opposite.

Related Links:

Extensive Q & A - says VLANs more secure because not based on IP.


VLANS and Subnets - 10 things you need to know 


VLAN vs Subnet - says VLAN can be hacked but does not expound. Includes configuration.


Interesting discussion of segmenting traffic at Layer 2 and 3 - pros and cons of VLANs vs. subnets. Understand your network traffic.


Java JIT Optimization

Write short methods for inlining.

Avoid polymorphic dispatch on the same call site - e.g. putting different types in a list and calling a method on each element.

Keep small objects local to aid in escape analysis

Use intrinsic methods. (putLong, etc.)

Inspect inlining - -XX:+PrintInlining, -XX:+PrintAssembly, both behind -XX:+UnlockDiagnosticVMOptions (yeah...we all know assembly....)

Secure Java Programming

Notes for secure Java programming

Normalization - convert all data to common format before checking input since the same character can be represented by different codes in different languages and character sets.
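The idea is language neutral, so here is a minimal sketch in Python (in Java the equivalent call would be java.text.Normalizer.normalize). The function name and the tiny forbidden-character rule are made up for illustration only; a real check would be far stricter.

```python
import unicodedata


def normalize_then_check(raw: str, forbidden: str = "<>") -> bool:
    """Return True if the input passes the (illustrative) check after normalization."""
    # NFKC folds compatibility forms, e.g. full-width characters, to a common form.
    canonical = unicodedata.normalize("NFKC", raw)
    return not any(ch in canonical for ch in forbidden)


# U+FF1C and U+FF1E are the full-width forms of "<" and ">".
fullwidth = "\uff1cscript\uff1e"

# A naive check on the raw string sees no "<" at all.
print("<" in fullwidth)                          # False
print(unicodedata.normalize("NFKC", fullwidth))  # <script>
print(normalize_then_check(fullwidth))           # False
```

Checking before normalizing would have let the full-width characters through, which is the whole point of normalizing first.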


Code injection - injecting commands into exec statements in Java, for example.

Identification - who user is

Authentication - verify person is who they say they are. Make sure you are using a reputable, solid JAAS implementation

Authorization - verify user is allowed to perform selected action. Many options: again make sure solid source and reputable vendor, well tested over time.

Output encoding - let everything come into your app, then validate and make sure data cannot be executed as code when submitted to any other process. Encode special characters for the output context.

Blacklisting - many ways to bypass. Not best approach. Characters in different languages, character sets.

Whitelist - only accept known good characters. Hard to do because can break functionality - such as when checking names or passwords that may have special characters.

ReDoS - regular expression denial of service attack.

- Deterministic regular expression engines try to match one time

- Non-Deterministic regular expression engines try multiple times - can craft inputs that take systems down. Beware of repeating groups and make characters required if possible.

Use parameterized queries wherever possible to prevent SQL injection

Can use Apache XML encoding library to encode XML data, for example

Jdgui - Java decompiler

JavaSnoop - debug Java code without the source code. Defeats obfuscation tools.


Consider implementing sensitive code in C/C++ and calling it through JNI. Allows advanced obfuscation, anti-debug, anti-tamper tactics.

Obfuscation and compiling to native code (Excelsior JET, GCJ -- may not be production ready) can make it take longer to get at the source.

Don't trust client. Even if behind firewall. Some sources report over 50% of breaches are by accidental or intentional insiders. Phishing and social engineering can cause user machine to be hacked and used by an attacker.

Validate XML against an XSD (javax.xml.validation.Validator) or a DTD (DocumentBuilderFactory.setValidating(true)).

XML injection - alters XML to invalid format.

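XPath injection - XPath concatenation can allow querying additional information. XPath has no direct equivalent of a prepared statement, but javax.xml.xpath.XPathVariableResolver lets you bind user input as variables instead of concatenating it.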

External entity attack (XXE). A DOCTYPE can declare an entity with the SYSTEM keyword that points at an external file or URL. The parser substitutes the entity content, so an attacker can inject malicious replacements, try to read system files or create DoS attacks.

XML entity expansion attack. Recursive entity replacement creates a huge payload when parsed even though the initial input is small.

To prevent XML attacks configure the XML parser to disallow vulnerable features. Disallow doctype declarations and turn on XMLConstants.FEATURE_SECURE_PROCESSING. If you require external doctype declarations write a custom entity resolver - this is complicated. Static analysis tools may check for this.

Path Manipulation Attack - unauthorized access to files by breaking out of the current path (e.g. with ../) or accessing a file via an absolute path. To prevent, resolve the canonical path and verify it is still inside the expected directory before proceeding.
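A rough sketch of that directory check, written in Python to keep it short (in Java you would compare File.getCanonicalPath() results in the same way). The function name resolve_inside is hypothetical.

```python
import os


def resolve_inside(base_dir: str, user_path: str) -> str:
    """Resolve user_path under base_dir, rejecting anything that escapes it."""
    base = os.path.realpath(base_dir)
    # join() discards base when user_path is absolute, realpath() collapses ../
    candidate = os.path.realpath(os.path.join(base, user_path))
    # commonpath() compares whole path components, not string prefixes.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the allowed directory")
    return candidate
```

Both "../../etc/passwd" and "/etc/passwd" end up outside the base directory after resolution and raise ValueError, while "reports/a.txt" resolves normally. Note this only checks the path at one moment in time, so it does not remove the race conditions discussed further down.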

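Use the Java Security Manager to limit what code can do. Run your application with the -Djava.security.manager flag and specify a policy file with -Djava.security.policy. (Note the Security Manager is deprecated for removal as of Java 17.)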

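Temp files may be pre-created with malicious content. On Unix systems an attacker can create a symbolic link at a predictable temp file name so the application writes to a file of the attacker's choosing. Java now has the ability to set file permissions.

java.util.Random - predictable. Use java.security.SecureRandom for anything security related.

Java 7 introduced several new methods in java.nio.file.Files (createTempFile, createTempDirectory) that create files with unpredictable names and will not create a file if one with that name already exists. For cleanup, java.io.File.deleteOnExit() registers the file for deletion when the JVM shuts down.

In Java 7 the try statement (try-with-resources) can declare objects that implement the AutoCloseable interface, which are automatically closed at the end of the try.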

Java 7 has a multi catch block to handle multiple types of exceptions.

Race conditions - timing issues while accessing resources causes data to be invalid.

TOCTOU - check a property on a resource then someone changes that resource before the resource is used.

Java file APIs that work on paths have TOCTOU issues. You check file properties, then the file behind that path can be swapped or changed before it is used. Note that java.io.File is only an abstract path name and does not hold the file open, so it has the same problem. Operating on an already-open stream or channel avoids this because it holds a handle on the actual file.

Deadlocks can cause DoS attack.

Encryption problems: skipping cert checks, no hostname check, no certificate validation, no certificate in the truststore.

HttpsURLConnection in Java does these checks for you by default - other libraries may not.

Chef, Ansible, Puppet, Salt

Articles comparing Chef, Puppet, Ansible, Salt


Usage Stats

Ansible beats salt on security 

Ansible vs Puppet, Chef

A search of the MITRE CVE database shows some pretty substantial vulnerabilities in Salt, the most in Puppet (but it is the most widely used and has been out longer) and the least for Ansible:


Ok after all that I lean towards Ansible but need to try it. I like the idea of using a language popular with sysadmins vs. a customized language. The agent-less model appeals more from a security and administration standpoint. Agents can't be hacked if they're not there. Push vs. pull can get changes out more quickly. All this without having used the tool yet. But I also know the AWS kids at Amazon use it and love it.

Here are some interesting ideas to try:

Fixing HeartBleed with Ansible

Secure MySQL with Ansible

Ansible SSH security considerations

As noted in previous posts I am interested in storage of keys separate from data as number one problem with encryption in companies today. Earlier this year Ansible added a vault feature. Will be interesting to see how this works and if facilitates this separation.


Web Security Vulnerabilities

Same Origin Policy: ability for web browser to restrict scripts from accessing DOM properties and other methods of another site.

JSONP - opens up a lot of risks. Recommend not using.

DOM based cross site scripting

Cross domain messaging

Stealing data from web storage

Risks introduced by HTML5 elements and attributes (Video, Audio, Canvas, Geolocation)

Architectural Flaws, Implementation Bugs

Can run into buffer overflow with unmanaged code in C#.

XSS - JavaScript can be injected through many different tags: video, script, etc. An attacker inserts JavaScript into a form or URL, or submits a malformed request straight to the server, to direct response data to an alternate site and steal it. If an attacker can get a user to log in through a malicious page they can steal credentials and session IDs.

Input validation - attempts to block or allow certain characters with white lists, black lists, exact match, semantic rules.

Output Encoding - may be preferable to input validation. This tactic allows entering any character but encoding problematic characters so they won't be interpreted as executable code. There are common encoding libraries but some are not suitable for production. 

Output encoding should be done for any user or 3rd party input in HTML, CSS, JavaScript, URL, etc.
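To make the "encode for the output context" idea concrete, here is a small sketch using only the Python standard library. It covers just two contexts (HTML body/attribute and a URL component); the sample payload is made up, and JavaScript and CSS contexts need their own encoders that are not shown here.

```python
import html
from urllib.parse import quote

# Pretend this arrived from a user or a third party.
untrusted = '"><script>alert(1)</script>'

# HTML context: turn markup characters into entities.
html_safe = html.escape(untrusted, quote=True)

# URL component context: percent-encode everything that is not unreserved.
url_safe = quote(untrusted, safe="")

print(html_safe)  # &quot;&gt;&lt;script&gt;alert(1)&lt;/script&gt;
print(url_safe)   # %22%3E%3Cscript%3Ealert%281%29%3C%2Fscript%3E
```

The same input needs a different encoding for each place it lands, which is why the encoding happens on output and not once on the way in.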

SQL injection: insert SQL into web inputs to run arbitrary SQL code against the web site's database. First step is to insert a single quote. If the site is vulnerable it will throw an error. Check the version, etc. to get the database type. Then query system tables and columns. Then execute arbitrary SQL. Not always that simple but that's the gist of it.
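A small, self-contained demonstration with an in-memory SQLite database. The table, rows and payload are invented for the example. It shows the same input run through string concatenation and through a parameterized query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "a1"), ("bob", "b2")])

# Classic payload: close the quote, then add a condition that is always true.
payload = "nobody' OR '1'='1"

# Vulnerable: the payload becomes part of the SQL text.
unsafe_sql = "SELECT name FROM users WHERE name = '" + payload + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Parameterized: the payload is bound as data and never parsed as SQL.
safe_rows = conn.execute("SELECT name FROM users WHERE name = ?",
                         (payload,)).fetchall()

print(len(unsafe_rows))  # 2 (every row leaks)
print(safe_rows)         # []
```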

Session Vulnerabilities:

Session Fixation: change session ID after login

Session Prediction



Eavesdropping: Firesheep plugin - keep session URLs on SSL.

Cross Site Request Forgery - tricking the user's browser into making a request to another site, which is different than XSS which injects code into a page. The browser automatically attaches the target site's cookies to any request it sends there, even when the request is triggered by a different site or an email. So for example, if someone is logged into a site and an email is sent that includes an image whose URL is actually a malicious command on that site, the cookies are included when the user views the image and allow the malicious action to occur. To prevent: #1. Have to prevent XSS. #2. Tokens per session, page or form. Not in a cookie, tied to the session.
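A minimal sketch of a per-session token. The session is just a dict standing in for whatever server-side session store you use, and both function names are hypothetical.

```python
import hmac
import secrets


def issue_csrf_token(session: dict) -> str:
    """Create an unpredictable token, keep it server side, return it for the form."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def is_valid_csrf(session: dict, submitted: str) -> bool:
    """Check the token sent back in the form field against the session copy."""
    expected = session.get("csrf_token", "")
    if not expected:
        return False
    # compare_digest avoids leaking how many characters matched via timing.
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

The token goes into a hidden form field (not a cookie), so a forged request from another site has the victim's cookies but no way to know the token.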

When using an iFrame can set sandbox properties so no code in the iFrame can affect the page embedding the iFrame.

Client side validation can help reduce load on the server but should never be relied on for security because it can be bypassed by web site visitors. Always validate again on the server.

TamperData Firefox plug in alters web request submissions.

Set autocomplete = off in input fields.
For Web Storage introduced in HTML5 store sensitive data in sessionStorage instead of localStorage so it is not persisted.


Indirect Reference Map - map fake data to real data and only send fake data to the browser and map it back to the real data for server processing.
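One way this could look, as a sketch only. The class and method names are invented, and a real implementation would keep one map per user session.

```python
import secrets


class IndirectReferenceMap:
    """Hands out random stand-in IDs and maps them back to real ones server side."""

    def __init__(self):
        self._to_real = {}
        self._to_fake = {}

    def expose(self, real_id):
        """Return the fake ID to send to the browser for this real ID."""
        if real_id not in self._to_fake:
            fake = secrets.token_urlsafe(8)
            self._to_fake[real_id] = fake
            self._to_real[fake] = real_id
        return self._to_fake[real_id]

    def resolve(self, fake_id):
        """Map a fake ID from a request back to the real one (KeyError if unknown)."""
        return self._to_real[fake_id]
```

The browser only ever sees values like the output of expose(1042), so guessing a neighbouring account number gets an attacker nothing but a KeyError.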

LSASS system service runs on Windows. .Net apps can use it to encrypt values and only use the encrypted values in memory.

Wireless Access Points, PEAP and Radius Servers

Started looking up what it takes to use PEAP with wireless access point.

There are a bunch of parts and pieces you need to put together...

RADIUS protocol ... RADIUS service on a server to auth.



Note that if you use an EAP solution that incorporates a vulnerable version of OpenSSL you will probably be subject to the HeartBleed attack.

ARP cache entries - view, modify, secure

The following links go to commands to view and modify the ARP cache on a machine. In order to prevent cache poisoning you might want to guard against gratuitous ARP by setting static MAC address entries for various machines.





Guard against gratuitous arp vulnerabilities for VOIP phones


Cisco document with details about gratuitous arp

Decoding IP Header - Example

Let's take a sample IP packet header and see what's in it. Here's our sample random IP header pulled out of WireShark traffic:

45 20 01 b4 96 25 40 00 39 06 60 6a 5d b8 d7 c8 0a 01 0a 13

An IPv4 header is between 20 and 60 bytes, and a length greater than 20 means we have options. So how long is this header?

Each hexadecimal character is four bits and 8 bits = a byte, so every two characters is one byte. So let's count the bytes:

45 20 01 b4 96 25 40 00 39 06 60 6a 5d b8 d7 c8 0a 01 0a 13

Ok looks like we have a 20 byte header so there are no options.

We'll need a couple things for our translation -- the cheat sheets in my last post to convert hex to binary and decimal:


Also the layout for the IPv4 header in this post which tells us the purpose of the hex values in the various positions:


Byte 1 (45)

The first two numbers are always the version and header length.

4 in hex = version 4  (IPv4) which is the default version.

5 is the header length. 5 in hex = 5 in decimal, and the field counts 32-bit (4 byte) words, so we multiply by 4 to get our length = 20 (confirms our analysis above).
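Splitting that first byte into its two halves is a shift and a mask. A quick Python sketch (variable names are just for illustration):

```python
first_byte = 0x45

version = first_byte >> 4        # high nibble
ihl_words = first_byte & 0x0F    # low nibble, counted in 32-bit words
header_bytes = ihl_words * 4

print(version, ihl_words, header_bytes)  # 4 5 20
```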

Byte 2 (20)

Byte 2 is 20. This is Type of Service. Going to skip this one for now as most routers ignore it.

Bytes 3-4 (01 b4) 

This is the datagram length (header + payload)

So the binary version of this, using our cheat sheet in prior post is:

0 0 0 0 0 0 0 1 1 0 1 1 0 1 0 0

We've got 1's in positions: 2, 4, 5, 7, 8

We grab the decimal values for these and add them up:

4 + 16 + 32 + 128 + 256 = 436

Yep, that matches up with Wireshark so cool.
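The same sum can be checked in a couple of lines of Python. Header fields are sent most significant byte first, so the first byte is worth 256 times its value.

```python
length_field = bytes.fromhex("01b4")

# Manual version: high byte shifted up 8 bits, plus the low byte.
manual = (length_field[0] << 8) | length_field[1]

# Library version of the same thing.
total_length = int.from_bytes(length_field, "big")

print(manual, total_length)  # 436 436
```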

Bytes 5-6 (96 25)

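This is our identification field - a value the sender assigns so the fragments of one datagram can be matched up for reassembly. Many systems randomize it, so not going to bother translating this one right now. Might be important if you want to verify randomness.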

Next 3 bits - flags (top of byte 40)

Bytes 7-8 are 40 00. The first 3 bits are the flags and the remaining 13 bits are the fragment offset, so we start with the binary version of the first hex digit, 4:

0 1 0 0

The first three bits (0 1 0) are the flags: reserved = 0, Don't Fragment (DF) = 1, More Fragments (MF) = 0. DF set means routers must not fragment this datagram, and MF clear means no more fragments follow.

Next 13 bits (all 0)

This is our fragment offset: the last bit of the 0 1 0 0 above plus the 12 bits of 0 00. The offset is 0 and the MF flag is clear, so this packet is a complete, unfragmented datagram.
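For reference, in the IPv4 header layout the flags field is 3 bits (reserved, Don't Fragment, More Fragments) and the fragment offset is the 13 bits after it. A short Python sketch that pulls them out of bytes 7-8 (40 00) of our sample:

```python
word = int.from_bytes(bytes.fromhex("4000"), "big")

reserved = (word >> 15) & 1        # always 0
dont_fragment = (word >> 14) & 1   # DF: 1 = routers must not fragment
more_fragments = (word >> 13) & 1  # MF: 1 = more fragments follow
fragment_offset = word & 0x1FFF    # low 13 bits, in units of 8 bytes

print(reserved, dont_fragment, more_fragments, fragment_offset)  # 0 1 0 0
```

So the one bit that is set is Don't Fragment, and with MF = 0 and offset = 0 this is a whole datagram.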

Next byte (39)

Next byte is time to live.  39 in hex translated to binary:

0 0 1 1 1 0 0 1

We've got values in positions: 0, 3, 4, 5 - grab the decimal values:

1 +  8 + 16 + 32 = 57

Cool - matches Wireshark again.

1 byte for the protocol (06)

Translate to binary

0 0 0 0 0 1 1 0

Translate to decimal - positions 1, 2

2 + 4 = 6

Take a look at our nifty protocol chart:


Looks like we have TCP (#6).

Next 2 bytes (60 6a)

This is our checksum. Equipment uses this to verify nothing has inadvertently changed.
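We can actually verify it. The IPv4 header checksum is the one's complement of the one's complement sum of all the 16-bit words in the header, so summing the whole header (checksum included) should give ffff. A Python sketch, with a helper name I made up:

```python
def ones_complement_sum(data: bytes) -> int:
    """Add 16-bit big-endian words, folding any carry back into the low 16 bits."""
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


header = bytes.fromhex(
    "45 20 01 b4 96 25 40 00 39 06 60 6a 5d b8 d7 c8 0a 01 0a 13")

# A valid header sums to 0xffff.
print(hex(ones_complement_sum(header)))  # 0xffff

# Recompute the checksum: zero the field (bytes 11-12), sum, then invert.
zeroed = header[:10] + b"\x00\x00" + header[12:]
print(hex(~ones_complement_sum(zeroed) & 0xFFFF))  # 0x606a
```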

4 bytes  (5d b8 d7 c8)

Source address

We need to figure out if we have a Class A, B or C IP address to know which bytes refer to network and which bytes refer to host in the address. (Classful addressing is the historical scheme; networks today use CIDR prefix lengths, but the classes still come up in study material.)

A - one byte for network, three bytes for host
B - two bytes for network, two bytes for host
C - three bytes for network, one byte for host

Look at first number to determine if class A, B or C:

1-126 = A (127 is reserved for loopback)
128-191 = B
192-223 = C

Each byte is part of address with dot (.) in between (dotted notation)

5d = 0 1 0 1 1 1 0 1 = positions = 0, 2, 3, 4, 6 = 1 + 4 + 8 + 16 + 64 = 93
b8 = 1 0 1 1 1 0 0 0 = positions = 3, 4, 5, 7 = 8 + 16 + 32 + 128 = 184
d7 = 1 1 0 1 0 1 1 1 = positions = 0, 1, 2, 4, 6, 7 = 1 + 2 + 4 + 16 + 64 + 128 = 215
c8 = 1 1 0 0 1 0 0 0 = positions = 3, 6, 7 = 8 + 64 + 128 = 200

So we have a class A address (93).

Address is 93.184.215.200

We can look that up at ARIN.net...but wait...it's a RIPE address?? Not sure why a computer on my network is connecting to a European address...but that's a topic for http://randominternet.blogspot.com

inetnum: -
netname:         EDGECAST-NETBLK-03
descr:           NETBLK-03-EU-93-184-212-0-22
country:         EU
admin-c:         DS7892-RIPE
tech-c:          DS7892-RIPE
status:          ASSIGNED PA
mnt-by:          MNT-EDGECAST
source:          RIPE # Filtered

4 bytes (0a 01 0a 13)

Destination address

Same concept as above.
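Both addresses can be read straight out of the last 8 bytes of the header. A short Python sketch using the standard ipaddress module:

```python
import ipaddress

header = bytes.fromhex(
    "45 20 01 b4 96 25 40 00 39 06 60 6a 5d b8 d7 c8 0a 01 0a 13")

# Bytes 13-16 are the source, bytes 17-20 the destination.
source = ipaddress.IPv4Address(header[12:16])
destination = ipaddress.IPv4Address(header[16:20])

print(source)                  # 93.184.215.200
print(destination)             # 10.1.10.19
print(destination.is_private)  # True
```

The destination falls in the 10.0.0.0/8 private range, which fits a machine on an internal network receiving traffic from an outside host.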

Hexadecimal to Binary to Decimal - Cheat Sheet

I'm studying hexadecimal to decimal conversions for packet header analysis (IP, TCP, UDP, etc).

Trying to come up with a cheat sheet to make the whole thing easier to remember.

First of all each numbering system has a single character representing each possible single digit value. After these values are used up you start tacking these single digits together to come up with bigger values.

For example the single digit values for each of the following numbering systems are:

Binary = base 2 = 0, 1
Decimal = base 10 = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Hexadecimal = base 16 = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F

In the list of characters above for hexadecimal the letters are just a way to use a single character for a two digit decimal number. So our 16 base numbers are 0-15 and we use letters for 10-15 (which are 2 digit numbers) as follows:

A = 10
B = 11
C = 12
D = 13
E = 14
F = 15

So why the heck are we making this all complicated and using these crazy numbering schemes instead of the decimal numbering system we know and love? Computers need a way to store and represent numbers. They don't have fingers. They have circuits. (I'm probably way over-simplifying this - intentionally) These [Boolean] circuits can either be on or off. I like to think of it as a row of light switches. Flip some of them on, some of them off. On is represented as 1 and off is represented as 0.

So let's say you had a row of 4 light switches and starting from right to left, first is off, second is on, third is on, fourth is off. That would look like:

0 1 1 0

The light switches on or off allow you to represent a binary number. So what's that binary number in decimal? Binary is base 2. For each position that has a 1, we raise 2 to the power of that position (starting with position 0 on the right) and add up the results to get the decimal number. So we have in this case positions: 3, 2, 1, 0. Position 1 and position 2 have a value, position 0 and position 3 don't so:

(0) + (2^2) + (2^1) + (0) = 0 + 4 + 2 + 0 = 6

So 0 1 1 0 in binary (or circuits in a computer turned on and off) = 6 in decimal
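Here is the same position-by-position calculation as a few lines of Python (the function name is just for illustration), checked against the built-in conversion:

```python
def binary_to_decimal(bits: str) -> int:
    """Add 2**position for every 1, counting positions from 0 on the right."""
    digits = bits.replace(" ", "")
    total = 0
    for position, bit in enumerate(reversed(digits)):
        if bit == "1":
            total += 2 ** position
    return total


print(binary_to_decimal("0 1 1 0"))  # 6
print(int("0110", 2))                # 6
```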

So what's hexadecimal for anyway? It takes less space to represent a hexadecimal number where a single hexadecimal digit can represent four binary digits. In other words instead of representing 15 as 1 1 1 1 we can just use F. Hexadecimal is used instead of decimal because 10 is not a power of 2 (it can't be written as 2^x), so it's not easy to translate a series of 1's and 0's to base 10.

In computer terminology each single digit of storage (circuit on or off, i.e. 1 or 0) is called a "bit". 8 bits = a "byte". 4 bits = half a byte or a "nibble". (har har)

One hexadecimal character is 4 bits (with a 1 or a 0 in each spot). If you think about it, it makes sense. Turn all four bits on (1) and calculate the decimal number:

1 1 1 1

or: (2^3 = 8) + (2^2 = 4) + (2^1 = 2) + (2^0 = 1) = 15 (counting from 0 to 15 = 16 digits).

We can turn that single four digit binary number into a 1 digit hexadecimal number and store 1 digit instead of 4

Ok now we want to take a hexadecimal digit and convert it to binary. So let's take 6, for example.

We'll have four bits to represent 6.

_ _ _ _

OK so for each of those spots we have to either put in a 1 or a 0 as required to represent a 6. If each of those slots represents a binary value and if each spot were filled with a 1 we'd have these decimal values for each corresponding position (again 2^0, 2^1, etc. counting from the right):

8 4 2 1

Ok so how do we come up with 6? 4 + 2. So the slots for 4 and 2 are set to 1 and the slot for 1 and 8 are set to 0. That gives us binary 6:

0 1 1 0

Let's try D. D in hexadecimal = 13 in decimal as shown above. We will need a 1 in the slots worth 8, 4 and 1 (8 + 4 + 1 = 13), which are positions 3, 2 and 0, so hex digit D is represented in binary as:

1 1 0 1

Now we can look at this another way to come up with our cheat sheet. We know the decimal value of each hexadecimal digit above. We can map out the binary to hex translation in a table like this:

Hex      Binary
0        0 0 0 0
1        0 0 0 1
2        0 0 1 0
3        0 0 1 1
4        0 1 0 0
5        0 1 0 1
6        0 1 1 0
7        0 1 1 1
8        1 0 0 0
9        1 0 0 1
A (10)   1 0 1 0
B (11)   1 0 1 1
C (12)   1 1 0 0
D (13)   1 1 0 1
E (14)   1 1 1 0
F (15)   1 1 1 1
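If you'd rather generate the cheat sheet than type it, Python's format specifiers do the work. A small sketch (hex_table is a name I made up):

```python
def hex_table():
    """Return (hex digit, decimal value, 4-bit binary) for 0 through 15."""
    return [(f"{value:X}", value, f"{value:04b}") for value in range(16)]


for hex_digit, decimal, binary in hex_table():
    print(f"{hex_digit}  {decimal:2d}  {' '.join(binary)}")
```

The X format gives the hex digit and 04b gives the binary padded to four places, so the row for 13 comes out as D, 13, 1 1 0 1.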

It will also be helpful to memorize or table-ize the values of each binary position for our translations from hex to decimal. For each position in a binary number there is a corresponding decimal number which is 2^position. We know already that position 0 = 1 (2^0), position 1 = 2 (2^1), position 2 = 4 (2^2) and position 3 = 8 (2^3). Our full table for 16 positions could look like this where each subsequent value doubles the value in the prior position:

Position   Value
0          1
1          2
2          4
3          8
4          16
5          32
6          64
7          128
8          256
9          512
10         1024
11         2048
12         4096
13         8192
14         16384
15         32768


Ok now let's say we have some crazy looking hexadecimal number that looks like this: AE06


First of all we know there are four bits for each hex digit:

_ _ _ _ | _ _ _ _ | _ _ _ _ | _ _ _ _

Now as above we know that for each four slots we'll have 8 4 2 1 as the decimal values behind 1 1 1 1. So let's translate that crazy hex number into binary one character at a time.

A = 10 as shown above and in our chart we see that is 1 0 1 0.
E = 14 and that is 1 1 1 0
0 = 0 and that is 0 0 0 0
6 = 6 and that is 0 1 1 0

Put that all together and what does it spell?! Ok I'll get out of cheerleader mode now.

1 0 1 0 1 1 1 0 0 0 0 0 0 1 1 0

What do we do with that?? Well we know that it's base 2 so for each digit we calculate 2^position and add up the result. So we have a 1 in positions (starting with position 0 on the right): 1, 2, 9, 10, 11, 13, 15.

We can grab the decimal value for each of those binary digit positions from the above binary position to decimal table and add them up:

2 + 4 + 512 + 1024 + 2048 + 8192 + 32768 = 44550

We can check that calculation on the handy dandy Windows calculator. Open it up and choose "programmer" from the view menu. Click on the "hex" radio button. Enter AE06. Then click on the decimal radio button. Yay it worked! I'm typing this all up from scratch and I guess I got it figured out.
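If you don't have the Windows calculator handy, the whole walk-through fits in a few lines of Python. This is just a sketch of the steps above (the function name is mine):

```python
def hex_to_decimal(hex_string: str):
    """Expand each hex digit to 4 bits, find the 1 positions, sum 2**position."""
    bits = "".join(f"{int(digit, 16):04b}" for digit in hex_string)
    positions = [p for p, bit in enumerate(reversed(bits)) if bit == "1"]
    return bits, positions, sum(2 ** p for p in positions)


bits, positions, value = hex_to_decimal("AE06")
print(bits)             # 1010111000000110
print(positions)        # [1, 2, 9, 10, 11, 13, 15]
print(value)            # 44550
print(int("AE06", 16))  # 44550
```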

Hopefully having the translation cheat sheets above will help in a pinch, or can go the route of memorizing all of the above - kind of like my parents used to grill into me and their grade school students to learn their math facts :)

Related - Translating IP headers (and UDP, TCP, etc. not mentioned in the post below) from Hex to meaningful values humans can understand - I'm assuming here most people don't speak hex.


Key Management Systems & Cloud Encryption

Listening to SANS cryptography session. I've always wondered why there is so much focus on secure code for encryption but not a lot of discussion about key management. I've blogged and tweeted about this mystery in the past and caveat about the first link - I'm digging into cryptography a bit more right now and may be revising:




Data needs to be protected at rest and in transit, and the key needs to be protected too. If you fail at any one of these your encryption is useless. A lot of people understand the first two. The third is often overlooked and possibly most important, because a short key length still takes some time to hack, but keys out in the open mean you might as well hand the data to the adversary.

Oh and about the "well it's on our internal network behind the firewall" argument, I hope anyone involved in a corporation of any size that is utilizing pen testing services and/or has been breached understands by now this is a completely naïve viewpoint. I used to have to argue with network admins at managed data centers who didn't want to set my outbound traffic firewall rules. Now outbound traffic is one of the primary ways for determining if you're hacked since all malware calls home. APTs have attackers infiltrating systems throughout corporate networks. If they get access to your data you want to have the keys that decrypt that data in a separate place.

So now that we understand that even if our key is in our own house we need to separate it from the data and protect it from people who might get access to the encrypted data, how and where do we store and manage the keys?

We need to store them away from the data, make them accessible to the applications that need to decrypt the data, protect the data in transit and rest (and in memory for things like credit cards on POS machines...)

There are conceptual discussions of protecting the key, and I understand you should put your key on a separate server, away from your data. But what is the best way to actually implement this solution?  What about protecting keys in conjunction with using a cloud provider where you want to protect your keys and have them completely managed by someone other than the cloud provider so anyone that gets access to your encrypted data in the cloud cannot access the keys?

One of the quandaries I have with using these vendors is the Trojan horse concept. If your keys are the absolute most critical thing you need to protect because they provide access to all your data, giving your keys to a third party system is a bit scary.

One thing would be to make sure very limited amount of people have access to the keys. Additionally through separation of duties you can make sure the same people who manage the keys do not have access to the encrypted data. But how do you actually implement that?

I remember from a business school class a company that outsourced production of certain products to China. Their approach was to have different companies to produce different parts with no single company having access to the complete formula. Perhaps such an approach would work with key management, similar to multi-factor authentication. In order to decrypt you need multiple pieces of information. Of course this adds overhead to processing and slows things down.

The other issue is actually limiting access. After my first graduate degree in software engineering in 2000 I went out to look at technology for a couple of venture capital firms. I didn't have much security knowledge at that point but I remember going to look at some technology from a company one of the VCs was considering for investment. A Russian guy proudly explained how they were able to bypass corporate firewalls by tunneling through SSL ports. This freaked me out somewhat. If these guys can tunnel through an SSL channel which is supposed to be for SSL and completely bypass the firewall, it made the firewall kind of useless. I didn't understand all the ins and outs of networks, ports, packets, etc. at that point but I thought: what's the point of blocking all these ports if someone can use any old port to get into your network?

So clearly blocking access to your keys by just adding firewall rules for ports is not enough. You'll need authentication, limiting access to only specific IP addresses, authorized users and servers, and you have to make sure no one can get into those servers, take over those IP addresses or insert a man in the middle attack, because then they can tunnel through to your key management store.

Obviously a number of layers of security are needed at different levels to protect your keys and make sure the only applications that should access those keys can get to them to decrypt only the data to which they should have access. There are complications with unencrypted data in memory as well. Thinking about the way SSL VPNs work - they download a thin client and everything runs in an encrypted RAM drive. Maybe something like that could be used for corporate applications running in the cloud.

Going to great lengths to encrypt every piece of data and protect it in every possible way can get very expensive and slow down system performance. Perhaps a better approach is to limit the risk of exposure to a reasonable degree, and add additional layers of detecting any malicious or unauthorized activity. Detection in this day and age of APTs and complexity which can be mind boggling at times, I would argue, is more important than prevention and have mentioned that in my own company in terms of auditing financial systems for data errors. This viewpoint was confirmed in SANS 401 class I took which has the motto: "Prevention is Ideal, detection is a must". So perhaps you limit your exposure and add a lot of auditing and alerts for unexpected activity.

I'm currently listening to the Diffie-Hellman key exchange section from SANS 401 (sans.org), which operates under the concept of asymmetric encryption and being able to do key exchange in the presence of an adversary. Being able to utilize vendor systems that can provide amazing value in terms of innovation, reduced time to market, segregation, fault tolerance, scalability and performance - without a capital expenditure - while minimizing the risk of loss of intellectual property, NPI data and credit cards at the same time is an interesting problem.

For the moment I'll be going through this list of vendors that have key management systems (from Wikipedia) and reading a few books on the matter ... to be continued (I know the suspense is killing you).


Maybe we can get a panel from these vendors to do a presentation at an upcoming Seattle AWS Architects and Engineers meet up:


Gartner says Amazon has 5 times the compute power of the next 18 cloud providers combined.

Pace of innovation increases when you increase deployment iterations and reduce the risk.

AWS builds custom servers. Optimized performance and 30% off the cost compared to private cloud purchasing servers from vendors.

DynamoDB gives you NoSQL with consistent (as opposed to eventually consistent) reads.

Because Amazon has built so many data centers they are obtaining expertise and getting better at it.

The success of Amazon is based in a big way on a distributed model of teams who manage their own technology [and interact through services, based on a blog post I read].

Scaling SQL databases is easy - partition the database. The problem is repartitioning the data while taking on new traffic. Initially Amazon avoided this by buying bigger boxes.


Amazon wrote a paper on Amazon Dynamo, highly available key-value store.

Distributed hash table.

Trade-off consistency for availability.

Allowed scalability while taking live traffic.

Scaling was easier but still required developers to benchmark new boxes, install software, wear pagers, etc.

Was a library, not a service.


DynamoDB: a service.

- durability and scalability
- scale is handled (specify requests per second)
- easy to use
- low latency
- fault tolerant design (things fail - plan for it)

At Amazon when they talk about durability and scalability they always go after three points of failure for redundancy.

Quorum in distributed systems

DynamoDB handles different scenarios of replica failures so developers can focus on the application.

SimpleDB has max 10 GB and customer has to manage their own permissions.

Design for minimal payload, maximum throughput.

Can run map reduce jobs through DynamoDB. EMR gives Hive on top of DynamoDB.

Many AWS videos and re:invent sessions on AWS web site.

HasOffers uses DynamoDB for tracking sessions, deduplication.

Session tracking is perfect for NoSQL because look everything up by a single key: session id.

Deduplication: event deduplication.

Fixing DynamoDB problems ... Double capacity, maybe twice, fix the problem, drop the capacity

Being asynchronous and using queues is nice option.

Relational databases are more flexible for querying. Something to consider when determining whether you want to use RDBMS or NoSQL.

Hash key is single key. Can also have combo key.

Hash key = distribution key

Optimal design = large number of unique hash keys + uniform distribution across hash keys.

Important to pick hash key with large cardinality.
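To see why cardinality matters, here is a toy Python sketch that assigns keys to partitions by hashing. This is NOT how DynamoDB actually partitions internally; the 4 partitions, the SHA-256 choice and the key names are all assumptions for illustration.

```python
import hashlib
from collections import Counter


def partition_for(hash_key: str, partitions: int = 4) -> int:
    """Toy partitioner: hash the key and take it modulo the partition count."""
    digest = hashlib.sha256(hash_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def spread(keys, partitions: int = 4) -> Counter:
    """Count how many items land on each partition."""
    return Counter(partition_for(key, partitions) for key in keys)


# 10,000 distinct customer ids: load spreads across all partitions.
high_cardinality = spread(f"customer-{i}" for i in range(10_000))

# 10,000 items but only two distinct keys: at most two partitions do any work.
low_cardinality = spread(["US", "EU"] * 5_000)

print(sorted(high_cardinality.items()))
print(sorted(low_cardinality.items()))
```

With only two distinct hash key values, all the traffic piles onto one or two partitions no matter how much capacity is provisioned overall.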

Range key: composite primary key - 1:N relationships. Optional range condition. Like == < > >= <=

e.g. customer id is hash key and range key is photo id

Local secondary indexes. e.g. Two customers share a key. Requires more throughput capacity.

Hash + Range must be unique

Data types supported: string, number, binary and sets of the three.

Cannot add or change secondary indexes after initial creation of table...may be coming.

Global secondary indexes are separate tables asynchronously updated on your behalf. GSI lookup is eventually consistent. May require one or more updates.

Local secondary index = max 10 GB per hash key. May be a reason to move to GSI.

GSI has its own provisioned reads and writes whereas LSIs use provisioned table reads and writes.

1-1 relationship: hash key and secondary index

1-Many index: hash key and range key

NoSQL - no transaction support in DynamoDB

Can only double throughput when changing. Amazon looking at changing this.

Choosing the right data store:

SQL: structured data, complex queries, transactions.

NoSQL: unstructured data, easier scaling

DataPipeline automates moving between data stores.

A client only app is available which emulates DynamoDB to develop without paying AWS fees.

OSI and TCP Model - Network Layers

Studying for GIAC and just seeing if I can write these from memory.

We use the OSI model to talk about network layers and the TCP/IP model to implement.

OSI Model

(P)lease (D)o (N)ot (T)hrow (S)ausage (P)izza (A)way

Physical layer (layer 1) - transmission of raw binary data (0's and 1's). Typically via electrical signals (Ethernet), radio frequency (wireless) or light (fiber optics).

Data Link Layer (layer 2) - Switches typically operate at this layer. This is the logical layer - where the data has meaning as opposed to raw binary data.

Network Layer (layer 3) - routing layer - where most routers operate and determine the path the data will take through the network. Some switches, referred to as routing switches, operate at this layer.

Transport Layer (Layer 4)  - packages and orders data as it flows through the network.

Session Layer (Layer 5) - virtual connection between two points for transmission of data.

Presentation Layer (Layer 6) - transforms the data into machine independent data that can be read by any computer, whether big endian (most significant byte first) or little endian (least significant byte first).

Application Layer (Layer 7) - the layer that handles providing particular needed network services to the application (HTTP, FTP, etc.)
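To make the byte-order point in the Presentation Layer entry concrete, here is a small sketch using `java.nio.ByteBuffer`. The class name and the sample value are just for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class ByteOrderDemo {
    // Serialize a 32-bit int into 4 bytes using the requested byte order.
    static byte[] encode(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    public static void main(String[] args) {
        int value = 0x01020304;
        // Big endian puts the most significant byte first.
        System.out.println(Arrays.toString(encode(value, ByteOrder.BIG_ENDIAN)));    // [1, 2, 3, 4]
        // Little endian puts the least significant byte first.
        System.out.println(Arrays.toString(encode(value, ByteOrder.LITTLE_ENDIAN))); // [4, 3, 2, 1]
    }
}
```

The same number produces different bytes on the wire, which is why both ends have to agree on an order.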

The TCP/IP Model

The TCP/IP Model has four layers, but some layers are just a combination of the layers above. There are still 7 layers; we just group them together in the TCP/IP model as follows:

Network - Layers 1 and 2
Internet - Layer 3
Transport - Layer 4
Application - Layer 5, 6, 7
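The grouping above is just a lookup table, which could be written down like this. The enum and method names are my own.

```java
class LayerMapping {
    enum TcpIpLayer { NETWORK, INTERNET, TRANSPORT, APPLICATION }

    // Each OSI layer carries its number and the TCP/IP layer it is grouped into.
    enum OsiLayer {
        PHYSICAL(1, TcpIpLayer.NETWORK),
        DATA_LINK(2, TcpIpLayer.NETWORK),
        NETWORK(3, TcpIpLayer.INTERNET),
        TRANSPORT(4, TcpIpLayer.TRANSPORT),
        SESSION(5, TcpIpLayer.APPLICATION),
        PRESENTATION(6, TcpIpLayer.APPLICATION),
        APPLICATION(7, TcpIpLayer.APPLICATION);

        final int number;
        final TcpIpLayer tcpIpLayer;

        OsiLayer(int number, TcpIpLayer tcpIpLayer) {
            this.number = number;
            this.tcpIpLayer = tcpIpLayer;
        }
    }

    // Look up the TCP/IP layer for an OSI layer number (1 through 7).
    static TcpIpLayer tcpIpLayerFor(int osiNumber) {
        for (OsiLayer layer : OsiLayer.values()) {
            if (layer.number == osiNumber) {
                return layer.tcpIpLayer;
            }
        }
        throw new IllegalArgumentException("OSI layers are numbered 1 to 7: " + osiNumber);
    }

    public static void main(String[] args) {
        for (OsiLayer layer : OsiLayer.values()) {
            System.out.println(layer.number + " " + layer + " -> " + layer.tcpIpLayer);
        }
    }
}
```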

Devices & Tools

NICs operate in Layer 1 and Layer 2, handling transmission of binary data via ethernet, token ring, wireless.

Sniffers operate at layer 2.

Switches natively operate at layer 2 though some have layer 3 routing capabilities and blade systems may allow for firewall.

Routers operate at layer 3. They use the IP address to determine which network to go to next but use ARP, routing tables and MAC addresses to get the packet from one hop to the next.

Firewalls operate at layer 3 or layer 4.
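A rough sketch of the "which network next" decision a router makes at layer 3, assuming a simple longest-prefix match over a small table. The routes and next-hop addresses are made up for illustration, and the ARP/MAC step that actually moves the packet to the next hop is not modeled.

```java
import java.util.ArrayList;
import java.util.List;

class RoutingTableSketch {
    // One route: a destination network in CIDR form and the next hop to forward to.
    static final class Route {
        final int network;
        final int prefixLength;
        final String nextHop;

        Route(String cidr, String nextHop) {
            String[] parts = cidr.split("/");
            this.prefixLength = Integer.parseInt(parts[1]);
            this.network = toInt(parts[0]) & mask(this.prefixLength);
            this.nextHop = nextHop;
        }

        boolean matches(int address) {
            return (address & mask(prefixLength)) == network;
        }
    }

    // Convert a dotted IPv4 address such as "10.1.2.3" into a 32-bit int.
    static int toInt(String dotted) {
        int value = 0;
        for (String octet : dotted.split("\\.")) {
            value = (value << 8) | Integer.parseInt(octet);
        }
        return value;
    }

    // Build a netmask with the given number of leading 1 bits.
    static int mask(int prefixLength) {
        return prefixLength == 0 ? 0 : -1 << (32 - prefixLength);
    }

    private final List<Route> routes = new ArrayList<>();

    void add(String cidr, String nextHop) {
        routes.add(new Route(cidr, nextHop));
    }

    // Pick the matching route with the longest prefix; null means no route.
    String nextHopFor(String destination) {
        int address = toInt(destination);
        Route best = null;
        for (Route route : routes) {
            if (route.matches(address) && (best == null || route.prefixLength > best.prefixLength)) {
                best = route;
            }
        }
        return best == null ? null : best.nextHop;
    }

    public static void main(String[] args) {
        RoutingTableSketch table = new RoutingTableSketch();
        table.add("0.0.0.0/0", "192.0.2.1");   // default route
        table.add("10.0.0.0/8", "192.0.2.2");
        table.add("10.1.0.0/16", "192.0.2.3");
        System.out.println(table.nextHopFor("10.1.2.3"));
    }
}
```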


Not too shabby. Didn't have to look anything up :)

On to header analysis and protocols.


How To Get Code Deployed

Here are some ideas to help get code deployed at work if you are a developer or QA (and to remind myself):

* Decline extraneous meetings

* Make sure the business requirements are complete for the story you are working before you start.

* Focus on writing and testing code

* Ignore distractions and noise like complaints and process discussions that don't get software deployed.

* Avoid being a distraction by complaining or instigating dissent and conflict.

* Stop talking about all the work you aren't going to be able to get done. Sit down and do what you can.

* Lead by example. Do what's right for the company.

* If you have competing priorities, ask the people giving them to you to prioritize them and start from the top. Make the priority list you have been given visible to all.

* Think about ways to get everyone working in parallel. Make sure you are not the bottleneck and blocking others from delivering value but rather contributing to them doing so successfully.

* Make new stories for scope creep. This makes the new effort clear to business.

* Before nixing a deployment go over the pros and cons of releasing all or part of the work in a certain state with business and give them the option of releasing or not.

* Focus on must haves for the end result that delivers business value.

* Sit with people when it makes sense and fix problems together

* Collaborate to resolve problems. Focus on the resolution that delivers the most business value, rather than an idealistic scenario or the one that gets people to stop bringing it up if it is truly hurting the company.

* Call out impediments that can be fixed; deal with those that are part of the process creatively, if you can.

* Write unit tests when you can and while QA is testing; don't be ivory tower about it and block the whole team.

* Have QA write automated tests while waiting for code drops so their testing is faster and repeatable.

* Find ways to seamlessly share data set up scripts for QA automation and developer unit tests to leverage people's work in more places and get more automated tests in place.

* Create system designs that are flexible to change.

* Design new systems that can run in parallel with old systems and can be turned on or off before or after deployment if something goes wrong.

* Structure code with appropriate interfaces and decoupled components so they can be independently unit tested and team members can work in parallel.

* Avoid monolithic designs with 5000 line stored procedures or classes that are a bottleneck to your project.

* Design the system so it can be deployed in pieces by thinking about how to break the system down into independent processes - which could ultimately become distributed services.

* Work towards a horizontally versus vertically scaling architecture, but make sure your management and visibility into all systems rolls up into a single source. Think correlation of logs and Splunk or a service for logging all actions by all systems.

* Break off the most problematic bits of complex systems into decoupled, horizontally scaling components instead of rewriting and deploying a whole new system in one shot.

* Automate everything. Make people's jobs easier and avoid mistakes.

* Avoid technology for technology's sake.

* Beware of making a new technology or set of libraries a dependency until you have done a proof of concept and completely vetted the technology to understand implications, limitations and security risks (published on security and other web sites). Do research. Test. Consider the cost of every project forced to use it going forward, including maintenance and production support. Consider the potential longevity of the technology vs. other options and industry trends so you can find people who want to work for you in the future.

* Be like Amazon. Use services for interfaces between systems.

* Measure and think about what will deliver the most profit to the bottom line. 

* And if you are like me and obsessive about delivering business value, and your schedule is flexible, adjust your schedule so you can work at least a couple hours when there are fewer people in the office and no meetings. 

* Working from home one day would be really nice but it would be helpful if the team could agree on which day because in office collaboration can be extremely helpful at times to move a project forward.

* Amazon tells employees to be self-critical. I like that. If you are wrong admit you made a mistake rather than obstinately protecting your ego and blocking delivery of maximum business value.

* Considerations for the process as a whole: http://websitenotebook.blogspot.com/2014/04/agile-and-scum-how-to-make-it-work.html
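The bullets above about decoupling behind interfaces and running a new system in parallel with the old one (with an on/off switch) can be sketched roughly like this. The fee calculators, their rules, and the toggle are all hypothetical, made up for illustration.

```java
import java.util.function.BooleanSupplier;

class ParallelRunSketch {
    // Callers depend only on this interface, so each implementation can be unit tested alone.
    interface FeeCalculator {
        long feeInCents(long amountInCents);
    }

    // The old system: a flat fee (made-up rule).
    static final class LegacyFeeCalculator implements FeeCalculator {
        public long feeInCents(long amountInCents) {
            return 500;
        }
    }

    // The new system: one percent of the amount (made-up rule).
    static final class NewFeeCalculator implements FeeCalculator {
        public long feeInCents(long amountInCents) {
            return amountInCents / 100;
        }
    }

    // Both systems stay deployed; the toggle decides which one handles each call.
    static final class SwitchableFeeCalculator implements FeeCalculator {
        private final FeeCalculator legacy;
        private final FeeCalculator replacement;
        private final BooleanSupplier useReplacement;

        SwitchableFeeCalculator(FeeCalculator legacy, FeeCalculator replacement, BooleanSupplier useReplacement) {
            this.legacy = legacy;
            this.replacement = replacement;
            this.useReplacement = useReplacement;
        }

        public long feeInCents(long amountInCents) {
            return useReplacement.getAsBoolean()
                    ? replacement.feeInCents(amountInCents)
                    : legacy.feeInCents(amountInCents);
        }
    }

    public static void main(String[] args) {
        boolean[] newSystemOn = {false};
        FeeCalculator calculator = new SwitchableFeeCalculator(
                new LegacyFeeCalculator(), new NewFeeCalculator(), () -> newSystemOn[0]);
        System.out.println(calculator.feeInCents(100_000)); // old system answers
        newSystemOn[0] = true;                              // flip the switch, no redeploy
        System.out.println(calculator.feeInCents(100_000)); // new system answers
    }
}
```

If something goes wrong after deployment, flipping the toggle back routes calls to the old code without another release.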

Agile and Scrum - How To Make It Work

Agile is about deploying software that delivers business value - faster.

The purpose is not to put points on stories or prioritize backlogs or look at burndowns. Those things support the goal. If they don't you should stop doing them.

The point of scrum is to get the work done that gets you a deliverable that has business value into production in the shortest window possible. (As opposed to a waterfall process that drags on for months before shipping, if anything gets shipped at all).

Your best measure of a scrum team is how often they deploy code, what the value of that code is to the business (ROI on projects) and how much it costs to maintain their software (defects, complexity, error handling). Determine how much time and money the business saves (or pays) after system improvements vs. how much it cost to get them. Savings could be in the form of meeting business regulations so you don't get fined, making people's jobs take 25 fewer steps, or making information more secure, accurate and timely to get and keep more customers, for example. If you're putting a new gadget on your web site, make sure it's something customers want more than anything else - that will bring you new customers and keep the ones you have. The great thing about web sites is you can test customer response to changes and see what kind of ROI you get. The value of a scrum team is delivering small pieces of functionality in a timely manner so you can measure the response for small changes as opposed to a monolithic project which may sink a lot of money and not deliver the expected returns.

Businesses are about delivering value and making money.

Ultimately this all flows to your business's profit and loss statement, so make sure your measurements are aligned proportionately with their impact to the bottom line.

Here's how to make agile or scrum work

(or whatever you want to call a technical process that delivers value)

...in my opinion and in the little utopia in my head... but also based on some relevant experience delivering business value. I helped a start up grow from $1,000 per month to a multi-million dollar company delivering customer focused software in frequent iterations in ways that minimized cost.

#1 Measure ROI. Determine which teams are producing the most deliverables with the least problems in regular intervals, as well as the value of the deliverables to the business bottom line. Observe. Emulate those teams. (But don't get in their way and distract them while you are doing it.)

#2 Stop scheduling meetings to talk about agile, process and scrum. Discuss blocking issues and share ideas in the two meetings within the scrum framework which are for this purpose - Retrospective and Sprint Planning. The point of agile is to get work done. If you are having lots of meetings to discuss process, agile and scrum outside of these meetings then my personal view is that your company is not agile or using scrum.

#3 Read the guide on scrum.org. Use what works and throw out the rest. Stop trying to add things that aren't in there unless a team asks for them and they are proven to deploy more high quality software faster. I should note that the scrum guide changed from its original form and seems to have been embellished with potentially extraneous meetings (and confirmed by someone who claims to have helped alter documentation at scrum.org). Focus on what gets software with high ROI shipped. Don't get hung up on semantics. Also note that a lot of consultants are riding the scrum wave and not helping your ROI. A recent thread on the Seattle Java User Group had a lot of complaints about this. Scrum is simple and that is the point. Consultants make it more complicated to charge you more money so you end up with more meetings and nonsense in scrum meetings that detracts from getting deliverables shipped. My sole job as scrum master was canning meetings that were not productive. That is how I helped my team stay on track - getting things out of their way they didn't want to do in the first place.

#4 Scrum is about supporting the technical team. If the technical team says something doesn't work for them stop trying to make them do it. Especially listen to the team members that are delivering the most value to the company, not so much the people who are complaining but not getting anything done -- or the people who say yes to everything and aren't contributing what they could be otherwise. They would probably be happy to contribute more if allowed. Stop pretending to be supporting the team with scrum masters - if the scrum masters are only creating disruptions for the team and inserting unnecessary overhead. Listen to the team instead and help when asked and truly needed rather than detracting from progress with extraneous commentary. 

#5 If the team is getting what they want and still not delivering software in a timely manner, either you have a dev lead who doesn't have the experience to design projects for agile deployments (who will learn with time), no leadership or some people blocking the team who are not team players or not good at agile and slowing things down. Perhaps people have personal agendas which are not aligned with business objectives or team success. Make sure you cut through the politics to figure out who is really delivering value. In other words the problem could be the people not the process. Make sure the team can move forward in a way that does not create undue business risk (test and audit everything, have solid team players that know what they are doing and can support those that don't but can learn) and remove impediments - which may mean removing people or adjusting teams to find a good mix of people who work well together and can deliver.

#6 Yes. You need leadership. The leader for implementation of scrum projects should be a technical person if you want to get things done in a timely manner and not have things blow up in production. But your leader can't be a person with a solely technical mindset either. You need to find leaders who understand the value the project is delivering and align the technical implementation and cost with the business need. How do you find these leaders? Measure. See #1. Resolve the people issues in #5. Some people think there should be no leader and that undermines the person who is the lead. That impedes the overall potential success of the team because it leads to lack of vision and competing objectives. Give your team a leader and back that leader up. If the leader is a good leader, they will not dictate except where there is a potentially dangerous risk or cost or lost opportunity. At that point the leader will press to do what is right for the business and with support the business will win. You can measure your leaders the same way you measure your teams and deliverables - not based on a political campaign, but who delivers value that translates to the bottom line.

#7 Unless you do not care about the cost of the project or the risk to the business, I disagree with letting anyone on the team do whatever they want. If you don't care about the cost of the project or a flawed system go for it. The project can take an eternity if your budget is open ended. Everyone will do their best and you may have to change the code six times but whatever. It may cost ten times more to support it because of poor error handling - or worse. You could be losing money you don't know about. You'll have to catch that through auditing on the business side. So here is where I diverge from the scrum guide. I've had to work until 11:00 PM for weeks to basically re-write flawed software given to someone that didn't have the skills to write it (a prior job). A good dev lead will understand the capabilities of his or her team members and give them work to let them learn and grow - or partition it into pieces within a framework that gives people opportunity without creating undue risk; In other words, without creating a situation where the business has things blow up in production, do costly rewrites, or implement something flawed such as a financial system that doesn't reconcile or a multi-threaded application with a tricky random bug. There are ways to safely structure projects and code and then there's chaos. And yes everyone makes mistakes but keeping them to a manageable amount is better, right? There's also a way to focus on testing instead of dictating - set up tests to determine if each team member's code works or not before you put it in production, rather than dictate how they do things - but if it's a free-for-all those tests are disregarded and bypassed because people can just do whatever they want.

#8. If someone has a new idea try it. Measure it. Test it. With one team. That is open to the idea or is having problems. Don't schedule meetings and training with all teams across the whole company for an unproven concept. Don't force teams that don't want to do it to participate if they are functional, happy teams. Every team is different. Let each team do what works as long as it is within company requirements and getting things done. Let the right team test it - one that can make sure the implementation is as efficient and easy to use as possible - if you plan to force it on all teams.  

#9. Continually trying to "improve" processes that are not broken and "solve" problems we don't have adds stress for teams that are trying to focus on getting their work done. Leave functional teams alone. If there is a problem, prove it with measurements to show which teams are delivering the most value to the business - and take into consideration which teams get which projects and why (such as skill or politics). Just because a team is delivering a lot of boring projects doesn't mean they aren't producing a lot of business value. Consider that if you don't hear about a team they are probably doing good work and that's why. But check the stats.

#10. Stop scheduling meetings FOR teams. Teams should be able to tell you when they need a meeting whether for current or future work. Meetings to help non-technical people figure out what the team is doing are the worst. Stop creating separate meetings for projects and workshops and scrum training. There are a handful of meetings in scrum and the team should choose when to have them, not the project manager, backlog owner or scrum master - depending on where they are in the process of getting deliverables completed for the current sprint. Because that is the point - delivering software. Let people get work done. Ask the team if they need or want the meeting before you schedule it if the business needs a meeting for some reason. If you have a team of yes-people, check in your scrum tool to see where deliverables for the current sprint are at before scheduling it. Meetings with too many or the wrong people are also a problem. Having business people in a technical design meeting might be fun for the business people but if they aren't needed it's better to have them off gathering much needed business requirements. Business requirement gathering meetings might be interesting to team members but they need to be getting software deployed. Bring everyone together on the business and technical side when there is a true need - such as an early demo of the software to discuss design and requirements where the two sides merge. The team will also let you know when they are ready for more work or stories - and that is a bigger problem - and my next point.

#11. Business people should be focusing on business requirements. Technical teams should focus on implementing them. Define the business rules (not how the screens look or technical implementation) before taking up the team's time. The technical team can't tell the backlog owner what the business rules are - like we only run this process on Tuesdays and it needs to happen before 5 p.m. The backlog owner, analyst, project manager, etc. are not the ones who design and build the technical implementation. Stop scheduling meetings for new projects when the business hasn't defined the rules and compliance hasn't signed off on them. Stop writing stories with technical specifications and system design. Do provide the team with the current end to end business process as it exists today before you start showing us the changes - and the changes as business requirements should be to process, not systems. The requirements are from the point of view of a non-technical person - what they see and interact with to support their work flow.  Consider User Experience, not just User Interface. At first I was skeptical of UX thinking it was just another buzzword, but now I am a fan. Consider a person trying to transfer money into a brokerage account. They enter their information on a web site. Done right? Not even close. The experience is from the point of the person entering the data to the point that all assets - which may trickle in at different times - and cost basis, which may also come later, show up in the customer's account. That involves the experience of the people that support all of that behind the scenes in customer service, business operations and possibly third party vendors and contra brokers. And by the way don't forget ALL the scenarios, like the one where the request to transfer goes off to the contra broker and into the ether and never gets completed. If you don't understand scenarios and use cases you are making projects cost more to the business. 
There are many articles online about these things. The acceptance criteria is what the system users will be able to do (not how or in which system). Think about the value to the user and the business. Define the business process in non-technical terms. The technical team will translate that into a system to support the process.

#12. Use tools appropriately to keep team focused on goal. Some people say people over tools. However if you have lightweight tools it's easy to get a grasp of where a project is at without disrupting the team with extraneous meeting time. Version One is easy to use -- If you keep it simple. People try to over complicate this as well. I'm not a fan of getting everyone in a room to write on pieces of paper or talking about work in general, non-quantifiable terms. And PLEASE don't make us play games and color with markers or do other such goofy things. Project managers like this kind of thing. Technical people hate it. (At least the technical people on my team do.)

Here's how we use Version One:

Two days before upcoming sprint (no meeting!) Dev and QA lead talk to team members and divvy out stories. That takes 5-10 minutes. Each person tasks out their stories - at their desk. Assignments can change but it's a starting point. Technical people can equate this to a multi-threaded or distributed process getting things done in parallel vs. a single threaded bottleneck with 10 people in a meeting room. Which will get the job done faster?

On day before or first day of sprint we have a one hour Sprint Planning meeting to review the plan. People can suggest additional tasks and they are added on the fly. Stories can be reassigned or removed from sprint depending on capacity. The other benefit of this is it allows people to have time to think about the tasks between the entering and review. We also hijacked a non-existent field for drop date because QA needs enough time before end of sprint to test if we are going to commit. We fill in release dates if not present for things worked this sprint. There is no need to discuss any of this planning for the upcoming or future sprints in the middle of a sprint unless you are pointing stories (i.e. getting work done). It is a distraction and hurts the team's ability to meet sprint commitments. And it is not scrum. Planning happens in Sprint Planning and Retrospective is for discussing and adjusting your process if needed. I've noticed another long term meeting for planning was added to the scrum guide - if you must have that meeting do it in conjunction with Sprint Planning and don't disrupt the team in the middle of sprint when focused on completing deliverables and put deployment items at risk.

Throughout the sprint the team updates Version One. Story and task status (in progress, done, etc) and hours left. The team also enters release dates and can enter dependencies once they are deep enough into the code to have a technical deployment strategy. Questions about what is in Version One can be asked in stand up if not too disruptive, or addressed to the person the question is for at his or her desk rather than taking up the entire team's time.

Poker Pointing occurs throughout the sprint and we plug story points into Version One - but with the following general rules:

- In my utopia, as has been the case for some projects, we do all our estimating in poker pointing, not in multiple separate meetings for the same purpose. Do it once - it's just a guess anyway and your estimates way in advance will be completely wrong because typically the business hasn't figured out exactly what they want or gotten sign off from all the appropriate people (which then leads to re-estimating and worse if code is started - rewriting).

- Sometimes we revisit and re-point stories if they have changed significantly.

- QA is typically blocked waiting for Dev drops in the first half of a four week sprint so poker pointing is best in the second week of the sprint if things are going well, or possibly third if not so well. Perhaps you wait until everything made it through RC because QA is also slammed in the third week and your deployment is at risk. The team may choose to have a few long poker pointing sessions at the end instead - especially if you have stories in a good state and/or enough points for the upcoming sprint. (Refer to the statement at the top of the article.)

- If your team is slammed consider if you have enough points for the next sprint, in which case you might not want to disrupt the team and cause the current deployment to slide for the sake of poker pointing. (Again, refer to the goal at the top of the article).

- We try to make sure the stories are ready to point before poker pointing and that we have enough of them. Because I currently work with the world's greatest analyst this is generally not a problem.  All team members can review stories before poker pointing if you have issues with this or just your leads to make sure stories are ready before dragging everyone into a room and disrupting their work.

Project managers and backlog owners can look in Version One any time to see the status of the project - instead of scheduling a meeting and disrupting the team. More time can be spent doing work instead of talking about work. If the business people want to know about specific stories or blocking issues, come to stand up - don't schedule yet another meeting. Additionally all the information in Version One can roll up into reports for business showing where projects and teams are at and upcoming work. Stop creating separate spreadsheets, reports, emails and busy work. Create automated reports from Version One that give business the decision points they need to run the business cost-effectively and profitably with minimal work for the team and fewer disruptive questions to teams trying to get work done. See #13.

#13. Measure. This is so important I will say it twice. Looking at Version One but more importantly what gets delivered each sprint is far more accurate than scheduling a meeting without looking at the facts and hearing the old "we are 90% complete" line. If the team says they will meet commitments and there are 20 hours left in the sprint but 60 hours left in Version One to do you can avoid a meeting which is only going to make it worse, and address the issue in retrospective and Sprint Planning. Measure production incidents, business value in terms of cost savings, increased income and profitability, new customers, and whether more work is getting done with new systems (as opposed to making things more cumbersome). Some systems create more work but deliver value by increasing assets, revenue, or reducing risk. You need someone who can measure appropriately - not just throw out numbers.

#14. The project needs to end. Agile doesn't lend itself to this concept very well without proper planning. This can be a downfall for agile when people don't understand how to make it work. You can think of agile projects as time-boxed. You have a list of things to get done and a budget. You'll get as many things on the list done as possible within that budget. The more up front planning you do and the more time you give your team to focus on deliverables, the more items in the list will get done. You don't have to have every screen designed at the start if the design is flexible, but you need your high priority business rules and processes documented and signed off on by decision makers and compliance before you start. New requirements = increased scope and cost in most cases. Incomplete decisions = project delays and increased cost. You need someone on the team that understands how to prioritize deployments to deliver business value early on and in a manner in which the project can be cut off at any point and not be a total loss. If you have high priority deliverables make sure they get deployed early and leave the nice to haves at the end. The other nice thing about deploying possibly incomplete but functional initial deployments is the ability to minimize risk (run old and new components in parallel or release non customer facing side first) and to get feedback before the targeted end of the project so you have time for changes within the project window.

#15. Remember why we have jobs. Ultimately the measurement of our work boils down to financial statements - profit and loss. If the business is doing well, we keep our jobs. People who drag entire teams into meetings to talk about work instead of doing work aren't focused on improving the bottom line. People who are constantly hopping on the latest buzzword bandwagon but don't understand the cost and value to business aren't really going to help your company in the long run. People on teams are happiest when they are productive and can add value to their full potential. Some have more potential than others perhaps, but everyone can be an important contributor to the overall workings of a company if they focus on the right things. They want to help your business grow. Let them. It will help the bottom line and lead to a more successful company - if you have the right people. The right people understand why we have jobs - because we are paid by a business that needs to make money and be profitable to do so. So I'm back where I started.

Agile is about deploying software that delivers business value - faster.

So get out of that meeting, walk to someone's desk to get your question answered, and get to work!

Java 8 Crash Course Notes

A summary of new things in Java 8. 

This is a little rough. Check the rest of the links on my Twitter account for more in depth articles.

() ->

Method reference
Allows retrofitting existing method in as a lambda argument.

- allows adding default methods to interface
- functional interfaces
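A minimal sketch of the three ideas above: a functional interface implemented by a lambda, a default method on that interface, and a method reference passed where a lambda is expected. The `Greeter` interface and the sample strings are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class LambdaBasics {
    // A functional interface has exactly one abstract method, so a lambda can implement it.
    @FunctionalInterface
    interface Greeter {
        String greet(String name);

        // Default method: an implementation that lives on the interface itself.
        default String greetLoudly(String name) {
            return greet(name).toUpperCase();
        }
    }

    static List<String> upperCaseAll(List<String> words) {
        // Method reference: the existing String.toUpperCase is passed where a lambda is expected.
        return words.stream().map(String::toUpperCase).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Greeter greeter = name -> "hello " + name;
        System.out.println(greeter.greet("world"));        // hello world
        System.out.println(greeter.greetLoudly("world"));  // HELLO WORLD
        System.out.println(upperCaseAll(Arrays.asList("a", "b"))); // [A, B]
    }
}
```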

Types of lambdas:


- performs function on argument passed into it

- provide things
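The notes above lost the interface names. Assuming they refer to the standard types in `java.util.function`, here is a sketch of a few of them. Which ones the talk actually covered is my guess.

```java
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

class LambdaTypes {
    // Function: takes an argument and returns a result.
    static final Function<String, Integer> LENGTH = s -> s.length();

    // Supplier: takes nothing and provides a value.
    static final Supplier<String> DEFAULT_NAME = () -> "unknown";

    // Predicate: takes an argument and returns true or false.
    static final Predicate<String> IS_BLANK = s -> s.trim().isEmpty();

    public static void main(String[] args) {
        // Consumer: takes an argument and returns nothing.
        Consumer<String> printer = System.out::println;
        printer.accept("length of 'lambda' = " + LENGTH.apply("lambda"));
        printer.accept("default name = " + DEFAULT_NAME.get());
        printer.accept("blank? " + IS_BLANK.test("   "));
    }
}
```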

New forEach:

Collections now have forEach method

      if(ship .... Etc
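The snippet above is cut off, so as a separate illustration (not a reconstruction of it), here is roughly what the new `forEach` on a collection looks like with a lambda. The names and the length filter are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ForEachDemo {
    // Collect the names that are at least minLength characters long.
    static List<String> longNames(List<String> names, int minLength) {
        List<String> result = new ArrayList<>();
        // forEach takes a lambda that is run once per element.
        names.forEach(name -> {
            if (name.length() >= minLength) {
                result.add(name);
            }
        });
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("al", "betty", "cy", "dmitri");
        System.out.println(longNames(names, 3)); // [betty, dmitri]
    }
}
```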


streams / parallelStreams
- Writing into a shared collection from a parallelStream means you have to synchronize the collection, and done that way the old sequential loop can be faster. The correct way is to gather results with the new Collectors class, e.g. Collectors.toList(), Collectors.toSet()

Variables within a function are thread safe - supports parallel execution. Supports distribution of load across multiple cores.
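A small sketch of the Collectors approach, assuming the work per element is a simple square. The class and method names are made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ParallelCollect {
    // Each element is squared independently, so the work can be split across cores.
    // collect(Collectors.toList()) merges the partial results safely, so there is
    // no shared list that needs synchronizing.
    static List<Integer> squares(List<Integer> numbers) {
        return numbers.parallelStream()
                .map(n -> n * n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(squares(Arrays.asList(1, 2, 3, 4))); // [1, 4, 9, 16]
    }
}
```

The risky alternative is calling `forEach` on the parallel stream and adding to an ordinary `ArrayList` from inside the lambda, which is not thread safe.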

JVM in Java 8 can run JavaScript. Can execute JavaScript from command line http://www.takipiblog.com/2014/02/10/java-8-compiling-lambda-expressions-in-the-new-nashorn-js-engine/

Nashorn to replace Rhino

New Java Time package.


Human readable time:

c = Clock.systemUTC()
Time Gap
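The notes here are thin, so this is only a sketch of a few `java.time` pieces I am sure exist: `Clock`, `Instant`, `LocalDate` and `Duration`. I am assuming "Time Gap" meant measuring the gap between two points in time. The helper names are made up.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDate;

class TimeDemo {
    // Whole hours between two instants.
    static long hoursBetween(Instant start, Instant end) {
        return Duration.between(start, end).toHours();
    }

    // Today's date according to the given clock (a fixed clock makes this testable).
    static LocalDate dateOn(Clock clock) {
        return LocalDate.now(clock);
    }

    public static void main(String[] args) {
        Clock c = Clock.systemUTC();
        Instant now = c.instant();
        // Instant and LocalDate print in human readable ISO-8601 form.
        System.out.println("now:   " + now);
        System.out.println("today: " + dateOn(c));
        System.out.println("hours: " + hoursBetween(now, now.plus(Duration.ofMinutes(210))));
    }
}
```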

- Optional gets you away from nulls
- deals with nullable return values
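A minimal sketch of `Optional` wrapping a nullable return value. The lookup map, user names and email address are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class OptionalDemo {
    private static final Map<String, String> EMAILS = new HashMap<>();

    static {
        EMAILS.put("ann", "ann@example.com");
    }

    // Map.get returns null for an unknown user; ofNullable turns that into an empty Optional.
    static Optional<String> emailFor(String user) {
        return Optional.ofNullable(EMAILS.get(user));
    }

    // The caller chains on the Optional instead of writing a null check.
    static String domainFor(String user) {
        return emailFor(user)
                .map(email -> email.substring(email.indexOf('@') + 1))
                .orElse("no email");
    }

    public static void main(String[] args) {
        System.out.println(domainFor("ann")); // example.com
        System.out.println(domainFor("bob")); // no email
    }
}
```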

IDE Support
- NetBeans
- IntelliJ

Code coverage:
Most tools working
FindBugs fails

Brother Printer Won't Print - Wireless Network

For future reference, because I always forget: if the IP address of your Brother printer changes and your machines won't connect, here's how to find it:

Go to your printer and choose MENU- PRINT REPORTS - NETWORK CONFIG


Get the IP address, go to control panel in Windows, edit the printer properties and you can hard code the correct IP address in the TCP/IP port assigned to the printer.

What is also curious when I print this report are all the different protocols that are enabled. I had no idea. Might be a good idea to turn off the ones you don't need:

NetBIOS/IP (Most likely want to turn this off if possible)
Remote Set Up
Raw Port
Web Services
PC fax
Network scan

Yeah... that's a lot of stuff that I'm not sure is all necessary in most cases. Would rather have most turned off by default. Will have to play around with turning them on and off and see what happens.

Crypto - Coursera classes

To watch...in my spare time in addition to studying SANS material for grad school program in information security engineering. No rest...


C# + ASPX Web App - 1 week of lunch hours

I spent my lunch hours over the past week helping a friend and co-worker build a one page web application to track software deployments. She knew some C# but not how to build a web site. 

It's been ages since I built .NET web sites. I once built a .NET web site through my business, ultimately for the State of Washington Department of Printing. That was probably the most extensive one. It integrated with a Java library (iText) over RPC to generate proofs of envelopes and letterhead - generating the bar codes was cool.

But I digress. The rest was C#: login, product catalog, image management, and a workflow to create and save new types of envelopes and letterhead and send them through multiple levels of approval. Once approved, the item was available in the product catalog for different state agencies to order. The site also facilitated these orders.

Then I moved to Java and more customized stuff.

But here are some notes from what I forgot and remembered this week:

1. To generate the code behind a control, create an ASPX control in the HTML. Then you can use IntelliSense (or whatever Microsoft calls it) by typing an attribute starting with "On" to see all your event handlers. When you choose one, type = and the name of a C# function in double quotes. Visual Studio will ask if you want to generate the code behind function. Say yes and you'll find it has been created when you switch from the ASPX view to the C# code behind view.

2. ASPX tags force you to set the runat attribute. Which is funny because you always have to set the value to "server". I bet it's legacy. 

3. You can create an ASPX tag for a data source. Then you can assign that data source to other ASPX tags. This is good for beginners - I started out trying to do the connection in the code behind. I wonder how they handle connection pooling in this case but studying too many other things to go find out.

4. To get your page to show all the data you hooked up to it, you can call Page.DataBind() in the code behind. You can also call DataBind() on specific controls like DataGrids.

5. To select a row in a GridView you can add a button for that purpose - see online examples, I don't have time to find one right now. The selected item events are then triggered when that button is clicked.

6. Why is populating a drop down list so convoluted - in any language? I was right about the selected item, index, etc. It depends on what you are trying to set, but yes, it is selected index = something.

7. You can reference GET and POST data using the Request object in the code behind, as you would expect.

8. Microsoft automagically gives you handles to all your ASPX tag components in the code behind, so if you have an ASPX text box named George you can set its value by saying George.Text="Code Monkey"; in the code behind with no other code needed to reference the object.

This was a simple one pager and a great refresher. The principles probably get you pretty far for simple in house apps.

AWS S3 + Encryption: Protecting the Key

I'm playing around with storing encrypted files on AWS S3. In theory if you encrypt the data yourself before you put it on S3 it doesn't matter who accesses it. They won't be able to read it. This makes some assumptions about cryptography. If you don't buy into the axiom that proper encryption protects your data, then really it isn't safe anywhere except on a box that never is connected to any network ultimately accessible from the Internet. I am not sure how useful that would be in most cases.

As for not putting data on shared infrastructure, I find this argument interesting because most companies already send data over shared carrier infrastructure such as Frame Relay or MPLS, or across the Internet itself. Their data is flowing over equipment shared with other customers. What protects it is encryption.

Companies can classify their data to determine what they feel comfortable putting into the cloud. Once you know what you want to put there, bottom line is you need to encrypt in transit and at rest. 

There are things that make running applications in the cloud trickier than data storage. In-memory data is a challenge (think Target) and there are legal forensic issues, but cloud providers can address these upon request. This is a topic I have researched as part of the SANS Master of Information Security Engineering program - I am still inquiring about the details with vendors and the research is still a work in progress.

CipherCloud has a solution that makes sense but I am still researching where and how keys are stored. There are also legal issues when it comes to where the data is stored, as noted in this article:

Some issues have also been raised in this blog post about the encryption techniques CipherCloud uses. I can't speak to the accuracy of this because I haven't used the product, but I am aware that encrypting individual data points in a way that preserves their semantics can help someone decipher the content, and that is a problem.

According to this article some improvements were made, but again I cannot speak to whether the problems were all solved.


But then encrypting your data is probably better than no encryption at all - unless it is a scenario where you are letting a Trojan horse into your environment. I'm sure whoever is using these solutions has considered all of that. I think what I am trying to do is much simpler.

If you are storing files that are encrypted in their entirety before they hit the AWS network, and you are not giving any third party your keys, this seems like a good candidate for a trial application on AWS.
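As a rough sketch of encrypting a file's bytes before they leave your machine, the following uses the standard javax.crypto API with AES-GCM. The class and method names are my own, the 256-bit key size is an assumption, and the upload itself and key storage are deliberately left out; the key would have to live somewhere other than S3:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

class ClientSideEncryptionSketch {
    private static final int IV_BYTES = 12;  // commonly recommended IV length for GCM
    private static final int TAG_BITS = 128; // authentication tag length

    // Generates a random AES key. Keep it off the box that stores the data.
    static SecretKey newKey() throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256); // assumed key size
        return generator.generateKey();
    }

    // Returns IV followed by ciphertext and tag; this blob is what would be uploaded.
    static byte[] encrypt(SecretKey key, byte[] plaintext) throws GeneralSecurityException {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv); // fresh IV for every file
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ciphertext = cipher.doFinal(plaintext);
        byte[] blob = new byte[iv.length + ciphertext.length];
        System.arraycopy(iv, 0, blob, 0, iv.length);
        System.arraycopy(ciphertext, 0, blob, iv.length, ciphertext.length);
        return blob;
    }

    // Reads the IV from the front of the blob, then decrypts and verifies the tag.
    static byte[] decrypt(SecretKey key, byte[] blob) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, blob, 0, IV_BYTES));
        return cipher.doFinal(blob, IV_BYTES, blob.length - IV_BYTES);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SecretKey key = newKey();
        byte[] blob = encrypt(key, "file contents".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decrypt(key, blob), StandardCharsets.UTF_8));
    }
}
```

GCM is used here because it detects tampering as well as hiding the content; decrypt throws if the stored blob was modified.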

Of course you'll also need to consider compliance issues. AWS has the greatest number of compliance certifications so likely they can help meet that requirement in a more cost effective manner.

As I posted on Twitter a while back (I thought it up all on my own but since have heard others say similar things):

Encrypting data and storing the key in plain sight on the same box is like locking your front door and hanging the key on the door knob.

Found this article on general rules for cryptography and key management:

Cryptographic storage cheat sheet.

Exploring latest research (and studying material from my last SANS course on cryptography) and will add more later.

Estimating AWS Costs Using AWS Calculator

Tips for estimating AWS costs

Amazon Calculator

Gather requirements
Map requirements to services
Right-size service choices
Evaluate pricing-model options
Use the Simple Monthly Calculator
Deliver *estimate* (educated guess)

Gather requirements
- Hardest part
- Location
- Duration
- Availability
- Operating system
- Prioritize Compute and Storage

EC2 will be the majority of the cost

Choose Instance type
- Start with memory requirements and architecture type (32 / 64 bit)
- then choose virtual cores required
- then select between alternatives based on use-case

Reserved instances may be cost effective, but you pay for the full term of the reservation even if you stop using the instance.
- if rates go down you'll get the lower rate going forward, but no refund on prior usage

Spot instances are good for map reduce jobs - bid for an instance, pay a lower cost, but instances can be terminated at any time.

Look for the peak IOPS storage requirements of the various components of the solution, then size EBS provisioned IOPS (pIOPS) and choose EBS-Optimized instances accordingly. Look at logs for the SAN, database, etc. to determine current usage.

- based on specific work load
- database will have different IOPS requirements at different times

Pick the right region -- prices vary.

Detailed monitoring is every 1 minute instead of every 5 minutes. Costs more.

Data transfer in is always free. Data transfer out is not. May be able to estimate data transfer out looking at firewalls.

Storage on the instance is ephemeral - it will go away with your instance. Storage on an EBS volume remains if you stop the instance, so you'll want EBS storage in addition to instance storage for most applications.

When selecting an instance take a look at drives - SSD vs other - and IO - High, Medium, etc.
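The arithmetic behind a first-pass estimate can be sketched as below. Every rate and quantity here is a made-up placeholder, not real AWS pricing, and the 730 hours per month figure is an assumed average; plug in numbers from the calculator for your region:

```java
class CostEstimateSketch {
    // Assumed average number of hours in a month (8760 / 12).
    static final double HOURS_PER_MONTH = 730.0;

    // Sums compute, EBS storage and outbound transfer. Inbound transfer is free, so it is omitted.
    static double monthlyEstimate(int instanceCount, double hourlyRate,
                                  double ebsGb, double ebsRatePerGbMonth,
                                  double transferOutGb, double transferOutRatePerGb) {
        double compute = instanceCount * hourlyRate * HOURS_PER_MONTH;
        double storage = ebsGb * ebsRatePerGbMonth;
        double transfer = transferOutGb * transferOutRatePerGb;
        return compute + storage + transfer;
    }

    public static void main(String[] args) {
        // Hypothetical inputs: 2 instances at $0.10/hour, 100 GB EBS at $0.05/GB-month,
        // 50 GB transferred out at $0.10/GB.
        double estimate = monthlyEstimate(2, 0.10, 100, 0.05, 50, 0.10);
        System.out.printf("Estimated monthly cost: $%.2f%n", estimate);
    }
}
```

With these placeholder inputs compute dominates the total, which matches the note above that EC2 will be the majority of the cost.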

Competitive Analysis - AWS

AWS is rated highest in the Gartner Magic Quadrant in both ability to deliver and completeness of vision.

This analysis includes Google, Azure, RackSpace, SoftLayer, vCHS and is based on a third party presentation. I cannot speak authoritatively on the accuracy of the statements below.

IaaS vs PaaS - the latter has some complications in the realm of gathering forensic data.

AWS has been around 8 years - other services much less.


Amazon compared to Microsoft (Azure), RackSpace, vCHS, Google, SoftLayer:

AWS - Broadest Global footprint

AWS - More scalable than RackSpace, SoftLayer, vCHS

AWS has most security certifications.

AWS offers the most Instance types for specific app needs: high IO, memory, compute power, etc.

Pricing model - no sub-hour billing, but spot and reserved instances are available.

AWS does not have live migration - Google, vCHS do if you feel that is important.

DevOps - AWS yes; Google, Azure, SoftLayer no.

AWS offers hosted desktop - workspaces.

Hadoop - AWS has EMR (elastic map reduce). Azure offers HD Insight.

Benchmarks need to compare apples-to-apples instance types. Some benchmark papers aren't the highest quality.

Pricing: most competitors are more expensive, except Google, which has lower prices in some areas.

Edit 3/28/2014 Scratch that. Google cut their prices in 1/3. Amazon turned around and dropped some of their per hour pricing below Google's. However Google still offers per minute pricing and a cap if instances are on all the time which Amazon's pricing model cannot quickly match. But we'll have to keep an eye on that based on discussion I had with someone today - so that could change by the time you read this. Here's the pricing at this time:


Security is the number one concern companies have when considering using the cloud. Amazon is the largest online retailer and understands securing financial transactions.

Ability to innovate: 
Try out scripting and spinning up environments on each platform.

Security is the number one barrier to entry for enterprises.
Compare security processes.