# Regenerate [![Build status](https://travis-ci.org/mathiasbynens/regenerate.svg?branch=master)](https://travis-ci.org/mathiasbynens/regenerate) [![Code coverage status](https://img.shields.io/codecov/c/github/mathiasbynens/regenerate.svg)](https://codecov.io/gh/mathiasbynens/regenerate)

_Regenerate_ is a Unicode-aware regex generator for JavaScript. It allows you to easily generate ES5-compatible regular expressions based on a given set of Unicode symbols or code points. (This is trickier than you might think, because of [how JavaScript deals with astral symbols](https://mathiasbynens.be/notes/javascript-unicode).)
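
To make the astral-symbol problem concrete: without the `u` flag, a JavaScript regular expression sees a code point above U+FFFF as two UTF-16 code units (a surrogate pair), which is why the patterns in this README contain sequences such as `\uD834\uDF06`. The following is a minimal sketch of the standard UTF-16 arithmetic. `toSurrogatePair` is a hypothetical helper for illustration, not part of Regenerate’s API.

```js
// Hypothetical helper (not part of Regenerate): split an astral code point
// (U+10000 to U+10FFFF) into its UTF-16 high and low surrogates.
function toSurrogatePair(codePoint) {
  var offset = codePoint - 0x10000;
  var high = 0xD800 + (offset >> 10); // top 10 bits of the offset
  var low = 0xDC00 + (offset & 0x3FF); // bottom 10 bits of the offset
  return [high, low];
}

var pair = toSurrogatePair(0x1D306);
console.log(pair.map(function(unit) {
  return '\\u' + unit.toString(16).toUpperCase();
}).join(''));
// → \uD834\uDF06
```
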
## Installation

Via [npm](https://npmjs.org/):

```bash
npm install regenerate
```

Via [Bower](http://bower.io/):

```bash
bower install regenerate
```

In a browser:

```html
<script src="regenerate.js"></script>
```

In [Node.js](https://nodejs.org/), [io.js](https://iojs.org/), and [RingoJS ≥ v0.8.0](http://ringojs.org/):

```js
var regenerate = require('regenerate');
```

In [Narwhal](http://narwhaljs.org/) and [RingoJS ≤ v0.7.0](http://ringojs.org/):

```js
var regenerate = require('regenerate').regenerate;
```

In [Rhino](http://www.mozilla.org/rhino/):

```js
load('regenerate.js');
```

Using an AMD loader like [RequireJS](http://requirejs.org/):

```js
require(
  {
    'paths': {
      'regenerate': 'path/to/regenerate'
    }
  },
  ['regenerate'],
  function(regenerate) {
    console.log(regenerate);
  }
);
```
## API

### `regenerate(value1, value2, value3, ...)`

The main Regenerate function. Calling this function creates a new set that gets a chainable API.

```js
var set = regenerate()
  .addRange(0x60, 0x69) // add U+0060 to U+0069
  .remove(0x62, 0x64) // remove U+0062 and U+0064
  .add(0x1D306); // add U+1D306
set.valueOf();
// → [0x60, 0x61, 0x63, 0x65, 0x66, 0x67, 0x68, 0x69, 0x1D306]
set.toString();
// → '[`ace-i]|\\uD834\\uDF06'
set.toRegExp();
// → /[`ace-i]|\uD834\uDF06/
```

Any arguments passed to `regenerate()` will be added to the set right away. Both code points (numbers) and symbols (strings consisting of a single Unicode symbol) are accepted, as well as arrays containing values of these types.

```js
regenerate(0x1D306, 'A', '©', 0x2603).toString();
// → '[A\\xA9\\u2603]|\\uD834\\uDF06'

var items = [0x1D306, 'A', '©', 0x2603];
regenerate(items).toString();
// → '[A\\xA9\\u2603]|\\uD834\\uDF06'
```
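
Symbols and code points are interchangeable here because each single-symbol string corresponds to exactly one code point. As a rough illustration using only built-in ES2015 string methods (this README does not show how Regenerate performs the conversion internally, so treat this as a sketch of the idea only), the two kinds of input map to each other as follows:

```js
// Built-in ES2015 methods only; no Regenerate required.
var symbols = ['A', '©', '☃', '𝌆'];
var codePoints = symbols.map(function(symbol) {
  return symbol.codePointAt(0);
});
console.log(codePoints.map(function(codePoint) {
  return '0x' + codePoint.toString(16).toUpperCase();
}));
// → [ '0x41', '0xA9', '0x2603', '0x1D306' ]

// And back again:
console.log(String.fromCodePoint.apply(null, codePoints));
// → A©☃𝌆
```
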
### `regenerate.prototype.add(value1, value2, value3, ...)`

Any arguments passed to `add()` are added to the set. Both code points (numbers) and symbols (strings consisting of a single Unicode symbol) are accepted, as well as arrays containing values of these types.

```js
regenerate().add(0x1D306, 'A', '©', 0x2603).toString();
// → '[A\\xA9\\u2603]|\\uD834\\uDF06'

var items = [0x1D306, 'A', '©', 0x2603];
regenerate().add(items).toString();
// → '[A\\xA9\\u2603]|\\uD834\\uDF06'
```

It’s also possible to pass in a Regenerate instance. Doing so adds all code points in that instance to the current set.

```js
var set = regenerate(0x1D306, 'A');
regenerate().add('©', 0x2603).add(set).toString();
// → '[A\\xA9\\u2603]|\\uD834\\uDF06'
```

Note that the initial call to `regenerate()` acts like `add()`. This allows you to create a new Regenerate instance and add some code points to it in one go:

```js
regenerate(0x1D306, 'A', '©', 0x2603).toString();
// → '[A\\xA9\\u2603]|\\uD834\\uDF06'
```

### `regenerate.prototype.remove(value1, value2, value3, ...)`

Any arguments passed to `remove()` are removed from the set. Both code points (numbers) and symbols (strings consisting of a single Unicode symbol) are accepted, as well as arrays containing values of these types.

```js
regenerate(0x1D306, 'A', '©', 0x2603).remove('☃').toString();
// → '[A\\xA9]|\\uD834\\uDF06'
```

It’s also possible to pass in a Regenerate instance. Doing so removes all code points in that instance from the current set.

```js
var set = regenerate('☃');
regenerate(0x1D306, 'A', '©', 0x2603).remove(set).toString();
// → '[A\\xA9]|\\uD834\\uDF06'
```
### `regenerate.prototype.addRange(start, end)`

Adds a range of code points from `start` to `end` (inclusive) to the set. Both code points (numbers) and symbols (strings consisting of a single Unicode symbol) are accepted.

```js
regenerate(0x1D306).addRange(0x00, 0xFF).toString();
// → '[\\0-\\xFF]|\\uD834\\uDF06'

regenerate().addRange('A', 'z').toString();
// → '[A-z]'
```
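
Keep in mind that a range is defined purely by code point order. As a quick check in plain JavaScript (no Regenerate involved), the `[A-z]` pattern from the example above spans U+0041 to U+007A, so it also matches the six punctuation symbols that sit between `Z` and `a`:

```js
var pattern = /^[A-z]$/; // the '[A-z]' output above, anchored for testing

// Collect every symbol in U+0041..U+007A that is not an ASCII letter.
var extras = [];
for (var codePoint = 0x41; codePoint <= 0x7A; codePoint++) {
  var symbol = String.fromCharCode(codePoint);
  if (pattern.test(symbol) && !/[A-Za-z]/.test(symbol)) {
    extras.push(symbol);
  }
}
console.log(extras.join(''));
// → [\]^_`
```

If only letters are wanted, two separate `addRange()` calls (`'A'` to `'Z'` and `'a'` to `'z'`) avoid the extra symbols.
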
### `regenerate.prototype.removeRange(start, end)`

Removes a range of code points from `start` to `end` (inclusive) from the set. Both code points (numbers) and symbols (strings consisting of a single Unicode symbol) are accepted.

```js
regenerate()
  .addRange(0x000000, 0x10FFFF) // add all Unicode code points
  .removeRange('A', 'z') // remove all symbols from `A` to `z`
  .toString();
// → '[\\0-@\\{-\\uD7FF\\uE000-\\uFFFF]|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF](?![\\uDC00-\\uDFFF])|(?:[^\\uD800-\\uDBFF]|^)[\\uDC00-\\uDFFF]'

regenerate()
  .addRange(0x000000, 0x10FFFF) // add all Unicode code points
  .removeRange(0x0041, 0x007A) // remove all code points from U+0041 to U+007A
  .toString();
// → '[\\0-@\\{-\\uD7FF\\uE000-\\uFFFF]|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF](?![\\uDC00-\\uDFFF])|(?:[^\\uD800-\\uDBFF]|^)[\\uDC00-\\uDFFF]'
```

### `regenerate.prototype.intersection(codePoints)`

Removes any code points from the set that are not present in both the set and the given `codePoints` array. `codePoints` must be an array of numeric code point values, i.e. numbers.

```js
regenerate()
  .addRange(0x00, 0xFF) // add extended ASCII code points
  .intersection([0x61, 0x69]) // remove all code points from the set except for these
  .toString();
// → '[ai]'
```

Instead of the `codePoints` array, it’s also possible to pass in a Regenerate instance.

```js
var whitelist = regenerate(0x61, 0x69);

regenerate()
  .addRange(0x00, 0xFF) // add extended ASCII code points
  .intersection(whitelist) // remove all code points from the set except for those in the `whitelist` set
  .toString();
// → '[ai]'
```
### `regenerate.prototype.contains(value)`

Returns `true` if the given value is part of the set, and `false` otherwise. Both code points (numbers) and symbols (strings consisting of a single Unicode symbol) are accepted.

```js
var set = regenerate().addRange(0x00, 0xFF);
set.contains('A');
// → true
set.contains(0x1D306);
// → false
```

### `regenerate.prototype.clone()`

Returns a clone of the current code point set. Any actions performed on the clone won’t mutate the original set.

```js
var setA = regenerate(0x1D306);
var setB = setA.clone().add(0x1F4A9);
setA.toArray();
// → [0x1D306]
setB.toArray();
// → [0x1D306, 0x1F4A9]
```
### `regenerate.prototype.toString(options)`

Returns a string representing (part of) a regular expression that matches all the symbols mapped to the code points within the set.

```js
regenerate(0x1D306, 0x1F4A9).toString();
// → '\\uD834\\uDF06|\\uD83D\\uDCA9'
```

If the `bmpOnly` property of the optional `options` object is set to `true`, the output matches surrogates individually, regardless of whether they’re lone surrogates or just part of a surrogate pair. This simplifies the output, so use it only if you’re certain that the strings it will be applied to don’t contain any astral symbols.

```js
var highSurrogates = regenerate().addRange(0xD800, 0xDBFF);
highSurrogates.toString();
// → '[\\uD800-\\uDBFF](?![\\uDC00-\\uDFFF])'
highSurrogates.toString({ 'bmpOnly': true });
// → '[\\uD800-\\uDBFF]'

var lowSurrogates = regenerate().addRange(0xDC00, 0xDFFF);
lowSurrogates.toString();
// → '(?:[^\\uD800-\\uDBFF]|^)[\\uDC00-\\uDFFF]'
lowSurrogates.toString({ 'bmpOnly': true });
// → '[\\uDC00-\\uDFFF]'
```
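
The practical difference shows up as soon as an astral symbol appears in the input. The sketch below uses plain regular-expression literals copied from the outputs above (so it runs without Regenerate) and tests them against `𝌆` (U+1D306), whose first code unit is the high surrogate U+D834:

```js
var strict = /[\uD800-\uDBFF](?![\uDC00-\uDFFF])/; // default output
var loose = /[\uD800-\uDBFF]/; // `bmpOnly: true` output

var loneHighSurrogate = '\uD834';
var astralSymbol = '\uD834\uDF06'; // 𝌆

console.log(strict.test(loneHighSurrogate), loose.test(loneHighSurrogate));
// → true true
console.log(strict.test(astralSymbol), loose.test(astralSymbol));
// → false true
```

The `bmpOnly` pattern reports a match inside the surrogate pair, which is why it is only safe on input without astral symbols.
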
Note that lone low surrogates cannot be matched accurately using regular expressions in JavaScript without the use of [lookbehind assertions](https://mathiasbynens.be/notes/es-regexp-proposals#lookbehinds), which were only added in ES2018 and therefore can’t be relied on in ES5-compatible output. Regenerate’s output takes a best-effort approach, but [there can be false negatives in this regard](https://github.com/mathiasbynens/regenerate/issues/28#issuecomment-72224808).
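
One way such a false negative can arise (an illustration in plain JavaScript, not necessarily the exact case discussed in the linked issue): the fallback pattern has to consume the character before a low surrogate in order to check it, so that character is no longer available to the next match. The lookbehind version below assumes an engine with ES2018 lookbehind support.

```js
// Pattern copied from the `lowSurrogates.toString()` output above.
var bestEffort = /(?:[^\uD800-\uDBFF]|^)[\uDC00-\uDFFF]/g;
// ES2018 alternative; requires lookbehind support in the engine.
var withLookbehind = /(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/g;

var input = '\uDC00\uDC00\uDC00'; // three lone low surrogates

console.log(input.match(bestEffort).length);
// → 1 (the first match spans two code units, and the third is missed)
console.log(input.match(withLookbehind).length);
// → 3
```
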
If the `hasUnicodeFlag` property of the optional `options` object is set to `true`, the output makes use of Unicode code point escapes (`\u{…}`) where applicable. This simplifies the output at the cost of compatibility and portability, since it means the output can only be used as a pattern in a regular expression with [the ES6 `u` flag](https://mathiasbynens.be/notes/es6-unicode-regex) enabled.

```js
var set = regenerate().addRange(0x0, 0x10FFFF);

set.toString();
// → '[\\0-\\uD7FF\\uE000-\\uFFFF]|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF](?![\\uDC00-\\uDFFF])|(?:[^\\uD800-\\uDBFF]|^)[\\uDC00-\\uDFFF]'

set.toString({ 'hasUnicodeFlag': true });
// → '[\\0-\\u{10FFFF}]'
```
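
To see why the flag matters, the following sketch compiles the `hasUnicodeFlag` output shown above with and without `u`, using only the built-in `RegExp` constructor. Without the flag, the engine does not read `\u{10FFFF}` as a code point escape, so the pattern no longer matches an astral symbol as a whole:

```js
// Output of `toString({ 'hasUnicodeFlag': true })`, copied from above.
var source = '[\\0-\\u{10FFFF}]';

var withFlag = new RegExp('^' + source + '$', 'u');
var withoutFlag = new RegExp('^' + source + '$');

var astralSymbol = '\uD834\uDF06'; // 𝌆

console.log(withFlag.test(astralSymbol)); // treated as one code point
// → true
console.log(withoutFlag.test(astralSymbol)); // treated as two code units
// → false
```
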
### `regenerate.prototype.toRegExp(flags = '')`

Returns a regular expression that matches all the symbols mapped to the code points within the set. Optionally, you can pass [flags](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp#Parameters) to be added to the regular expression.

```js
var regex = regenerate(0x1D306, 0x1F4A9).toRegExp();
// → /\uD834\uDF06|\uD83D\uDCA9/
regex.test('𝌆');
// → true
regex.test('A');
// → false

// With flags:
var regex = regenerate(0x1D306, 0x1F4A9).toRegExp('g');
// → /\uD834\uDF06|\uD83D\uDCA9/g
```

**Note:** This probably shouldn’t be used. Regenerate is intended as a tool that is used as part of a build process, not at runtime.
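
As a rough sketch of what build-time use can look like: a build script calls `toString()` once and writes the resulting pattern into generated source code, so the shipped code contains only a plain regular expression. To keep the example runnable without Regenerate installed, the pattern string is hard-coded from the `toString()` output shown earlier. `emitModule` and the `astral-regex.js` file name are hypothetical.

```js
// In a real build script this string would come from
// `regenerate(0x1D306, 0x1F4A9).toString()`.
var pattern = '\\uD834\\uDF06|\\uD83D\\uDCA9';

// Hypothetical helper: wrap a pattern in the source of a CommonJS module.
function emitModule(patternSource) {
  return 'module.exports = /' + patternSource + '/;\n';
}

var output = emitModule(pattern);
console.log(output);
// → module.exports = /\uD834\uDF06|\uD83D\uDCA9/;

// A build script would then write the result to disk, for example:
// require('fs').writeFileSync('astral-regex.js', output);
```

This sketch assumes the pattern contains no unescaped `/`. If that can’t be guaranteed, emitting `'new RegExp(' + JSON.stringify(patternSource) + ')'` instead of a literal is one way around it.
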
### `regenerate.prototype.valueOf()` or `regenerate.prototype.toArray()`

Returns a sorted array of unique code points in the set.

```js
regenerate(0x1D306)
  .addRange(0x60, 0x65)
  .add(0x59, 0x60) // note: 0x59 is added after 0x65, and 0x60 is a duplicate
  .valueOf();
// → [0x59, 0x60, 0x61, 0x62, 0x63, 0x64, 0x65, 0x1D306]
```

### `regenerate.version`

A string representing the semantic version number.
## Combine Regenerate with other libraries

Regenerate gets even better when combined with other libraries such as [Punycode.js](https://mths.be/punycode). Here’s an example where Punycode.js is used to convert a string into an array of code points, which is then passed on to Regenerate:

```js
var regenerate = require('regenerate');
var punycode = require('punycode');

var string = 'Lorem ipsum dolor sit amet.';
// Get an array of all code points used in the string:
var codePoints = punycode.ucs2.decode(string);

// Generate a regular expression that matches any of the symbols used in the string:
regenerate(codePoints).toString();
// → '[ \\.Ladeilmopr-u]'
```

In ES6 you can do something similar with [`Array.from`](https://mths.be/array-from), which uses [the string’s iterator](https://mathiasbynens.be/notes/javascript-unicode#iterating-over-symbols) to split the given string into an array of strings that each contain a single symbol. [`regenerate()`](#regenerateprototypeaddvalue1-value2-value3-) accepts both strings and code points, remember?

```js
var regenerate = require('regenerate');

var string = 'Lorem ipsum dolor sit amet.';
// Get an array of all symbols used in the string:
var symbols = Array.from(string);

// Generate a regular expression that matches any of the symbols used in the string:
regenerate(symbols).toString();
// → '[ \\.Ladeilmopr-u]'
```
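
The reason `Array.from` suits this job better than `split('')` is that the string iterator walks code points, while `split('')` walks UTF-16 code units and would cut an astral symbol in half. A small check with built-in methods only (the sample string is made up for illustration):

```js
var sample = 'a\uD834\uDF06b'; // 'a𝌆b'

console.log(Array.from(sample).length);
// → 3
console.log(sample.split('').length);
// → 4

// Code points instead of symbols, similar in spirit to `punycode.ucs2.decode()`:
var sampleCodePoints = Array.from(sample, function(symbol) {
  return symbol.codePointAt(0);
});
console.log(sampleCodePoints);
// → [ 97, 119558, 98 ]
```
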
## Support

Regenerate supports at least Chrome 27+, Firefox 3+, Safari 4+, Opera 10+, IE 6+, Node.js v0.10.0+, io.js v1.0.0+, Narwhal 0.3.2+, RingoJS 0.8+, PhantomJS 1.9.0+, and Rhino 1.7RC4+.

## Unit tests & code coverage

After cloning this repository, run `npm install` to install the dependencies needed for Regenerate development and testing. You may want to install Istanbul _globally_ using `npm install istanbul -g`.

Once that’s done, you can run the unit tests in Node using `npm test` or `node tests/tests.js`. To run the tests in Rhino, Ringo, Narwhal, and web browsers as well, use `grunt test`.

To generate the code coverage report, use `grunt cover`.

## Author

| [![twitter/mathias](https://gravatar.com/avatar/24e08a9ea84deb17ae121074d0f17125?s=70)](https://twitter.com/mathias "Follow @mathias on Twitter") |
|---|
| [Mathias Bynens](https://mathiasbynens.be/) |

## License

Regenerate is available under the [MIT](https://mths.be/mit) license.