Newer Post
Older Post
Slider(Newer 20)    Home (Archives)    Random Post    Slider(Random 20)    Slider(Older 20)

Monday, October 5, 2015

UUID Misc



https://segment.com/blog/a-brief-history-of-the-uuid/KSUID is an abbreviation for K-Sortable Unique IDentifier. It combines the simplicity and security of UUID Version 4 with the lexicographic k-ordering properties of Flake
KSUIDs are larger than UUIDs and Flake IDs, weighing in at 160 bits. They consist of a 32-bit timestamp and a 128-bit randomly generated payload. The uniqueness property does not depend on any host-identifiable information or the wall clock. Instead it depends on the improbability of random collisions in such a large number space, just like UUID Version 4. To reduce implementation complexity, the 122-bits of UUID Version 4 are rounded up to 128-bits, making it 64-times more collision resistant as a bonus, even when the additional 32-bit timestamp is not taken into account.
https://en.wikipedia.org/wiki/Universally_unique_identifier
The "nil" UUID, a special case, is the UUID, 00000000-0000-0000-0000-000000000000; that is, all bits set to zero
Version 1 (date-time and MAC address)
Version 2 (date-time and MAC Address, DCE security version)

Version 3 and 5 UUIDs are generated by hashing a namespace identifier and name. Version 3 uses MD5 as the hashing algorithm, and version 5 uses SHA1

The namespace identifier is itself a UUID. The specification provides UUIDs to represent the namespaces for URLs, fully qualified domain names, object identifiers, and X.500 distinguished names; but any desired UUID may be used as a namespace designator.
Version 3 and 5 UUIDs have the property that the same namespace and name will map to the same UUID. However, neither the namespace nor name can be determined from the UUID, given the other, except by brute-force search.

A version 4 UUID is randomly generated. As in other UUIDs, four bits are used to indicate version 4, and 2 or 3 bits to indicate the variant (10 or 110 for variants 1 and 2, respectively). Thus, for variant 1 (that is, most UUIDs) a random version 4 UUID will have 6 predetermined variant and version bits, leaving 122 bits for the randomly-generated part, for a total of 2122, or 5.3x1036(5.3 undecillion) possible version 4 variant 1 UUIDs.

You can use UUID this way to get always the same UUID for your input String:
 String aString="JUST_A_TEST_STRING";
 String result = UUID.nameUUIDFromBytes(aString.getBytes()).toString();
How to generate time based UUIDs?
The version 4 UUID is UUID based on random bytes. We fill the 128-bits with random bits (6 of the bits are correspondingly set to flag the version and variant of the UUID). No special configuration or implementation decisions are required to generate version 4 UUID's.
The version 1 UUID is a combination of node identifier (MAC address), timestamp and a random seed. The version one generator uses the commons-discovery package to determine the implementation. The implementations are specified by system properties.

https://github.com/cowtowncoder/java-uuid-generator
JDK's java.util.UUID has flawed implementation of compareTo(), which uses naive comparison of 64-bit values. This does NOT work as expected, given that underlying content is for all purposes unsigned. For example two UUIDs:
7f905a0b-bb6e-11e3-9e8f-000000000000
8028f08c-bb6e-11e3-9e8f-000000000000
would be ordered with second one first, due to sign extension (second value is considered to be negative, and hence "smaller").
Because of this, you should always use external comparator, such ascom.fasterxml.uuid.UUIDComparator, which implements expected sorting order that is simple unsigned sorting, which is also same as lexicographic (alphabetic) sorting of UUIDs (when assuming uniform capitalization).

https://en.wikipedia.org/wiki/Universally_unique_identifier
The challenge with a UUID is to make it be unique for multiple processes running on a single machine and multiple threads running in a single process. The Type 1 UUID as specified above does neither. On a fast machine with multiple cores it is quite possible to have a UUID generated with the same time value. This can be remedied only if the sequence number can span threads and processes, something that is quite challenging to do efficiently.
The Time Based UUID referenced compensates for these issues by:
  1. Only using the normal millisecond granularity returned by System.currentTimeMillis() and adjusting it to pretend to contain 100 ns counts.
  2. Incrementing the time by 1 (in a non-threadsafe manner) whenever a duplicate time value is encountered.
  3. Using a pseudo-random number associated with the UUID Class for the sequence number.
https://raw.githubusercontent.com/cowtowncoder/java-uuid-generator/3.0/release-notes/FAQ
Libraries
http://johannburkard.de/software/uuid/
UUID generates version 1 UUIDs that contain the the MAC address of a network card. 
https://wiki.apache.org/cassandra/TimeBaseUUIDNotes
https://github.com/cowtowncoder/java-uuid-generator
https://wiki.apache.org/cassandra/FAQ#working_with_timeuuid_in_java

http://www.java2s.com/Code/Java/Development-Class/YourownUUID.htm
timeseries in Cassandra and DynamoDB
The partition key + range key needs to be unique.  For my timeseries data it seemed pretty obvious that my range key needed to be the timestamp, and the partition key could be anything else about the event.  But how do I make this unique?  Theoretically an API consumer could issue two identical requests at once, so I might not have unique events.
Cassandra came to the rescue here.  Cassandra has a timeuuid data type. This is a type 1 (time-based) UUID with special ordering behaviour. Cassandra takes your timestamp plus some magic to make it unique and saves that.  When asked to order by one of these fields it looks first at the parts of the UUID which contain the time so the events are ordered as you expect.  It also has functions to determine the greatest and least possible timeuuid values for a given time, so you can use these to build a query that contains a time range.
you should include some time element in your partition key.  I started by using the date: my partition key is now the application ID plus the date, and the range key is still the full (unique-ified) timestamp. 

The solution I came up with was to include a sharding factor in the partition key.  I also changed the date part to be just the year and month, not the day.  So my partition key is now Application ID + year + month + sharding factor.  The sharding factor is simply a counter, modulo an application dependent sharding modulus. 
http://javadox.com/com.netflix.astyanax/astyanax-core/1.56.37/com/netflix/astyanax/util/TimeUUIDUtils.html

http://www.sohamkamani.com/blog/2016/10/05/uuid1-vs-uuid4/
The universally unique identifier, or UUID, was designed to provide a consistent format for any ID we use for our data. Another problem UUIDs were here to solve, was to not give a potential adversary any information about the data it represented.

This is a tradeoff between uniqueness and randomness that is represented by v1 and v4 of UUID generators.

V1 : Uniqueness

UUID v1 is generated by using a combination the host computers MAC address and the current date and time. In addition to this, it also introduces another random component just to be sure of its uniqueness.
This means you are guaranteed to get a completely unique ID, unless you generate it from the same computer, and at the exact same time. In that case, the chance of collision changes from impossible to very very small because of the random bits.
This guaranteed uniqueness comes at the cost of anonymity. Because UUID v1 takes the time and your MAC address into consideration, this also means that someone could potentially identify the time and place(i.e. computer) of creation. Try regenerating the UUIDs above, and you will see that some part of the UUID v1 is constant.

V4 : Randomness

The generation of a v4 UUID is much simpler to comprehend. Each and every bit of a UUID v4 is generated randomly and with no inherent logic. It’s that simple. There is, therefore, no question of anonymity.
However, there is now a chance that a UUID could be duplicated. The question is, do you need to worry about it?
The short answer is no. With the sheer number of possible combinations (2^128), it would be almost impossible to generate a duplicate unless you are generating trillions of IDs every second, for many years. This is a laughable standard for any application in todays world, and not substantial enough to take into consideration.

If you actually want your UUID to give some indication of the date and computer in which it was created, then UUID v1 may be for you.
https://stackoverflow.com/questions/2201558/sequential-guid-in-java
For sequential UUIDs, you are looking for a version 1 UUID. Java UUID Generator project seems to work quite well and is pretty easy to use:
Generators.timeBasedGenerator().generate().toString()
Posted by Jeffery at 7:39 PM
Email ThisBlogThis!Share to XShare to FacebookShare to Pinterest
Newer Post Older Post
Slider(Newer 20)    Home (Archives)    Random Post    Slider(Random 20)    Slider(Older 20)

Labels

Review (572) System Design (334) System Design - Review (198) Java (189) Coding (75) Interview-System Design (65) Interview (63) Book Notes (59) Coding - Review (59) to-do (45) Linux (43) Knowledge (39) Interview-Java (35) Knowledge - Review (32) Database (31) Design Patterns (31) Big Data (29) Product Architecture (28) MultiThread (27) Soft Skills (27) Concurrency (26) Cracking Code Interview (26) Miscs (25) Distributed (24) OOD Design (24) Google (23) Career (22) Interview - Review (21) Java - Code (21) Operating System (21) Interview Q&A (20) System Design - Practice (20) Tips (19) Algorithm (17) Company - Facebook (17) Security (17) How to Ace Interview (16) Brain Teaser (14) Linux - Shell (14) Redis (14) Testing (14) Tools (14) Code Quality (13) Search (13) Spark (13) Spring (13) Company - LinkedIn (12) How to (12) Interview-Database (12) Interview-Operating System (12) Solr (12) Architecture Principles (11) Resource (10) Amazon (9) Cache (9) Git (9) Interview - MultiThread (9) Scalability (9) Trouble Shooting (9) Web Dev (9) Architecture Model (8) Better Programmer (8) Cassandra (8) Company - Uber (8) Java67 (8) Math (8) OO Design principles (8) SOLID (8) Design (7) Interview Corner (7) JVM (7) Java Basics (7) Kafka (7) Mac (7) Machine Learning (7) NoSQL (7) C++ (6) Chrome (6) File System (6) Highscalability (6) How to Better (6) Network (6) Restful (6) CareerCup (5) Code Review (5) Hash (5) How to Interview (5) JDK Source Code (5) JavaScript (5) Leetcode (5) Must Known (5) Python (5)

Popular Posts

  • Archives
  • Design a chess game using OO principles | Runhe Tian Coding Practice
    http://k2code.blogspot.com/2014/03/design-chess-game-using-oo-principles.html http://swcodes.blogspot.in/2012/09/chess-game-design.html ...
  • Implement a jigsaw puzzle ~ KodeKnight
    Implement a jigsaw puzzle ~ KodeKnight Implement a jigsaw puzzle. Design the data structures and explain an algorithm to solve the puzzle....
  • thought-works: Object Oriented design for Elevator in a multi-storied apartment
    thought-works: Object Oriented design for Elevator in a multi-storied apartment A typical lift has buttons(Elevator buttons) inside the ca...
  • How to design a tiny URL or URL shortener?
    Related: http://massivetechinterview.blogspot.com/2015/06/n00tc0d3r.html https://puncsky.com/hacking-the-software-engineer-interview#desi...
  • Snake Game Design
    2. 设计贪吃蛇 怎么定义蛇, 怎么移动, 怎么吃, 怎么判断时候活着, 怎么定义游戏版 Design a snake game function played in nokia mobiles Different score strategies https://...
  • SSTable + LSM tree - Cassandra + LevelDB
    https://github.com/facebook/rocksdb/wiki/MemTable MemTable is an in-memory data-structure holding data before they are flushed to SST fi...
  • Improving Your Chrome Browsing Productivity with Vimium
    https://medium.com/@kevinpmcc/ditching-the-mouse-getting-started-with-vimium-chrome-extension-f45c61d6bc53 https://medium.com/@xnicox/10x-y...
  • Scalability for Dummies: Le Cloud Blog
    Scalability for Dummies: Le Cloud Blog Scalability for Dummies - Part 1: Clones Part 1 - Clones Public servers of a scalable web servi...
  • Finding a needle in Haystack: Facebook’s photo storage
    https://code.facebook.com/posts/685565858139515/needle-in-a-haystack-efficient-storage-of-billions-of-photos/ The Photos application is o...

Popular Posts

  • Archives
  • Design a chess game using OO principles | Runhe Tian Coding Practice
  • thought-works: Object Oriented design for Elevator in a multi-storied apartment
  • SSTable + LSM tree - Cassandra + LevelDB
  • How to design a tiny URL or URL shortener?
  • Finding a needle in Haystack: Facebook’s photo storage
  • MQTT - Facebook Messenger
  • Snake Game Design
  • Implement a jigsaw puzzle ~ KodeKnight
  • Readability内容分析算法,和它的那些多语言实现

Labels

  • Review (572)
  • System Design (334)
  • System Design - Review (198)
  • Java (189)
  • Coding (75)
  • Interview-System Design (65)
  • Interview (63)
  • Book Notes (59)
  • Coding - Review (59)
  • to-do (45)
  • Linux (43)
  • Knowledge (39)
  • Interview-Java (35)
  • Knowledge - Review (32)
  • Database (31)
  • Design Patterns (31)
  • Big Data (29)
  • Product Architecture (28)
  • MultiThread (27)
  • Soft Skills (27)
  • Concurrency (26)
  • Cracking Code Interview (26)
  • Miscs (25)
  • Distributed (24)
  • OOD Design (24)
  • Google (23)
  • Career (22)
  • Interview - Review (21)
  • Java - Code (21)
  • Operating System (21)
  • Interview Q&A (20)
  • System Design - Practice (20)
  • Tips (19)
  • Algorithm (17)
  • Company - Facebook (17)
  • Security (17)
  • How to Ace Interview (16)
  • Brain Teaser (14)
  • Linux - Shell (14)
  • Redis (14)
  • Testing (14)
  • Tools (14)
  • Code Quality (13)
  • Search (13)
  • Spark (13)
  • Spring (13)
  • Company - LinkedIn (12)
  • How to (12)
  • Interview-Database (12)
  • Interview-Operating System (12)
  • Solr (12)
  • Architecture Principles (11)
  • Resource (10)

Search This Blog

Blog Archive

  • ►  2021 (6)
    • ►  February (2)
    • ►  January (4)
  • ►  2020 (27)
    • ►  December (1)
    • ►  August (1)
    • ►  July (2)
    • ►  April (1)
    • ►  February (3)
    • ►  January (19)
  • ►  2019 (121)
    • ►  December (3)
    • ►  November (1)
    • ►  September (3)
    • ►  August (7)
    • ►  July (36)
    • ►  June (15)
    • ►  May (8)
    • ►  April (28)
    • ►  March (19)
    • ►  January (1)
  • ►  2018 (116)
    • ►  December (1)
    • ►  November (7)
    • ►  October (14)
    • ►  September (11)
    • ►  August (2)
    • ►  July (3)
    • ►  June (6)
    • ►  May (1)
    • ►  April (14)
    • ►  March (19)
    • ►  February (28)
    • ►  January (10)
  • ►  2017 (65)
    • ►  December (4)
    • ►  November (3)
    • ►  October (5)
    • ►  September (4)
    • ►  August (9)
    • ►  July (10)
    • ►  June (3)
    • ►  May (4)
    • ►  April (12)
    • ►  March (4)
    • ►  February (1)
    • ►  January (6)
  • ►  2016 (342)
    • ►  December (10)
    • ►  November (6)
    • ►  October (13)
    • ►  September (6)
    • ►  August (13)
    • ►  July (12)
    • ►  June (8)
    • ►  May (28)
    • ►  April (21)
    • ►  March (56)
    • ►  February (60)
    • ►  January (109)
  • ▼  2015 (724)
    • ►  December (144)
    • ►  November (86)
    • ▼  October (135)
      • Playlist Shuffle Algorithm
      • Java's serialization algorithm
      • Interview Misc
      • QR Code
      • Interview - Brain Teasers智力题
      • Lambda Architecture
      • Apache Storm Architecture
      • how to solve it - G. Polya
      • Java initialization order
      • Why Do Initializers Run In The Opposite Order As C...
      • how to solve algorithm problems
      • Cassandra
      • Design a Cache
      • System Design Misc
      • Docker
      • Message Brokers
      • Sping Source Code
      • Hadoop Big Data Ecosystem
      • [CareerCup] 10.1 Client-facing Service 面向客户服务器 - G...
      • [CareerCup] 10.5 Web Crawler 网络爬虫 - Grandyang - 博客园
      • [CareerCup] 10.6 Find Duplicate URLs 找重复的URL链接 - G...
      • [CareerCup] 10.7 Simplified Search Engine 简单的搜索引擎 ...
      • [CareerCup] 12.1 Find Mistakes 找程序错误 - Grandyang -...
      • [CareerCup] 12.2 Find the Reason of Crash 找到程序崩溃的原...
      • [CareerCup] 12.3 Test Move Method in a Chess Game ...
      • [CareerCup] 12.5 Test a Pen 测试一支笔 - Grandyang - 博客园
      • [CareerCup] 12.6 Test an ATM 测试一个自动取款机 - Grandyang...
      • [CareerCup] 12.4 Test a Webpage 测试一个网页 - Grandyang...
      • Linux Shell Scripting Misc
      • How to Be a Better Programmer
      • check if the stack grows up or down
      • Linux Misc
      • Principles of Distributed Computing
      • Lessons Learned From Reading Postmortems
      • The Chubby lock service for loosely-coupled distri...
      • Distributed Systems - 走向分布式
      • Distributed Systems: An Algorithmic Approach
      • What to Learn
      • Dapper, a Large-Scale Distributed Systems Tracing ...
      • Notes on Distributed Systems for Young Bloods
      • Distributed systems for fun and profit
      • Distributed Systems Theory
      • Java SimpleDateFormat + Commons FastDateFormat
      • Java IO Utils
      • JVM ShutdownHook
      • Java WeakHashMap
      • Java Reference Types
      • Java IdentityHashMap
      • How LinkedHashSet Works Internally in Java | Java ...
      • Premkumar Blogspot: Call Taxi Booking Application ...
      • Google Spanner Architecture
      • Google Bigtable Architecture
      • Google Calendar Architecture
      • Google Map Architecture
      • Network Misc
      • f4: Facebook's Warm BLOB Storage System
      • Path Finding
      • Snake Game AI
      • Snake Game Design
      • networking - Estimating distance between two serve...
      • Google File System
      • Apache Commons Source Code
      • 重复数据删除(De-duplication)技术研究 - 刘爱贵的专栏 - 博客频道 - CSDN.NET
      • Linux File System - superblock inode dentry
      • Taobao分布式文件系统TFS简析 - 刘爱贵的专栏 - 博客频道 - CSDN.NET
      • 【面试总结】Google Interview Phone Interview - wcq369201...
      • Java Copy On Write
      • Leetcode Shell
      • LeetCode 194 - Transpose File
      • Leetcode-192 Word Frequency
      • Big table | miafish
      • DynamoDB Architecture
      • Lamport & Vector Clocks | miafish
      • Application Design
      • Java Puzzles
      • Metrics & Monitoring
      • Monitoring with Graphite
      • Web Development Misc
      • LogStash
      • Java 8 Stream
      • NoSQL Misc
      • Android Tips and Tricks
      • Github Misc
      • Generate unique sequence number using distributed ...
      • LOSF(lots of small files) - MogileFS
      • Performance Tips
      • data serialization framework misc
      • JDK Collection Code
      • AngularJS Misc
      • Express - Node
      • Java TTL Cache
      • Achieving Rapid Response Times in Large Online Ser...
      • How to Build Team
      • Interview Questions Misc
      • Building Bug-Free Software
      • Writing Bug-Free Code During An Interview
      • Bug Free Programming
      • FizzBuzz
      • Defensive programming
      • Netty
    • ►  September (94)
    • ►  August (74)
    • ►  July (140)
    • ►  June (39)
    • ►  May (2)
    • ►  February (2)
    • ►  January (8)
  • ►  2014 (112)
    • ►  December (16)
    • ►  September (2)
    • ►  August (4)
    • ►  July (51)
    • ►  June (39)
  • ►  2013 (2)
    • ►  December (2)
  • ►  2011 (8)
    • ►  August (8)

Popular Posts

  • Archives
  • Design a chess game using OO principles | Runhe Tian Coding Practice
  • Implement a jigsaw puzzle ~ KodeKnight
  • thought-works: Object Oriented design for Elevator in a multi-storied apartment
  • How to design a tiny URL or URL shortener?
  • Snake Game Design
  • SSTable + LSM tree - Cassandra + LevelDB
  • Improving Your Chrome Browsing Productivity with Vimium
  • Scalability for Dummies: Le Cloud Blog
  • Finding a needle in Haystack: Facebook’s photo storage