en.search-data.min.f5002dc237d7f2ffe15a5d7029b5f8fd4fcda32c152132c0b5d3a044eb283fe6.js
'use strict';(function(){const indexCfg={cache:true};indexCfg.doc={id:'id',field:['title','content'],store:['title','href','section'],};const index=FlexSearch.create('balance',indexCfg);window.bookSearchIndex=index;index.add({'id':0,'href':'/docs/references/aggregation/avg/','title':"Avg aggregation",'section':"Aggregation",'content':"Avg aggregation # A single-value metrics aggregation that computes the average of numeric values that are extracted from the aggregated documents.\nExamples # Assuming the data consists of documents representing exam grades (between 0 and 100) of students, we can average their scores with:\nPOST /exams/_search { \u0026#34;aggs\u0026#34;: { \u0026#34;avg_grade\u0026#34;: { \u0026#34;avg\u0026#34;: { \u0026#34;field\u0026#34;: \u0026#34;grade\u0026#34; } } } } The above aggregation computes the average grade over all documents. The aggregation type is avg and the field setting defines the numeric field of the documents the average will be computed on. The above will return the following:\n{ ... \u0026#34;aggregations\u0026#34;: { \u0026#34;avg_grade\u0026#34;: { \u0026#34;value\u0026#34;: 75.0 } } } The name of the aggregation (avg_grade above) also serves as the key by which the aggregation result can be retrieved from the returned response.\nParameters for avg # field\n(Required, string) Field you wish to aggregate. "});index.add({'id':1,'href':'/docs/administration/observibility/health/','title':"Cluster health",'section':"Observability",'content':"Cluster health # Returns the cluster health for a quick overview.\nGet the cluster health # Requests # GET /_cluster/health "});index.add({'id':2,'href':'/docs/overview/concept/','title':"Concepts",'section':"Overview",'content':"Concepts # Pizza is a distributed search engine designed to efficiently index and retrieve documents across large-scale datasets. 
It organizes data in a hierarchical structure, allowing for flexible management and retrieval capabilities.\nBefore you start using Pizza, familiarize yourself with the following key concepts:\n [Pizza Concepts] Concepts # Cluster # A cluster represents a set of interconnected nodes that collectively form the Pizza search engine. Nodes within a cluster collaborate to store and process data efficiently. Clusters can span multiple physical locations for fault tolerance and scalability.\nZone # A zone is a logical grouping of nodes within a cluster. Zones are typically organized based on geographic proximity or network topology. They facilitate data replication and fault tolerance strategies by ensuring redundancy across different zones.\nRegion # A region is a further subdivision within a zone, typically representing a smaller geographical area or a distinct network segment. Regions help optimize data access and reduce latency by distributing data closer to users or applications.\nNamespace # Pizza supports multi-tenancy by design. A namespace is a logical container for collections of related data, providing isolation and organization. Namespaces can be used to group data according to different criteria such as application domain, user, or data type.\nCollection # A collection is a grouping of documents with similar characteristics or attributes. Collections represent the primary unit of storage and retrieval within Pizza. Each collection is vertically partitioned into \u0026ldquo;rollings\u0026rdquo; to efficiently manage large datasets.\nRolling # A rolling is a vertical partition of the entire collection dataset. Each rolling contains a subset of documents, with a maximum limit of 4.2 billion documents per rolling. Documents within a rolling are assigned an auto-increment sequence document ID, which is a uint32. 
Once a rolling is filled, the next rolling is automatically assigned, ensuring infinite scalability.\nPartition # A partition is a logical split and separation of data within a single rolling. Fixed at 256 partitions per rolling, partitions enable horizontal scalability and performance optimization by distributing data across multiple shards. Partitions are dynamically mapped to shards and can be scaled out or merged for better search performance.\nShard # A shard is a physical container for partitions within a single rolling of a collection. Each rolling can have a different setup of shards, allowing for customized scalability and performance optimization. Shards contain partitions within a single rolling, enabling efficient data distribution and retrieval.\nDocument # A document represents a unit of data indexed by the Pizza search engine. Documents can be of various types, such as text, images, or structured data. Each document contains fields that store specific attributes or properties, making it searchable and retrievable.\nField # A field is a specific attribute or property of a document. Fields contain the actual data that is indexed and searched within documents. Examples of fields include title, content, author, date, etc.\nStore # In Pizza, the \u0026ldquo;Store\u0026rdquo; refers to the primary storage for documents, also known as forward records. By default, it utilizes Parquet for storage, with the option to integrate other external storage types in the future.\nIndex # An index is a data structure used to efficiently retrieve documents based on search queries. It maps terms or keywords to the documents containing those terms, enabling fast lookup and retrieval. Indices are built and maintained based on the fields within documents.\nRelationships # Cluster to Zone/Region # A cluster consists of one or more zones, which may further contain multiple regions. 
Zones and regions facilitate data replication and fault tolerance strategies within the cluster.\nNamespace to Collection # Namespaces contain one or more collections, providing a logical grouping for related data. Collections within the same namespace share common management and access policies.\nCollection to Rolling # As data within a Collection grows, it\u0026rsquo;s vertically partitioned into Rollings to manage large datasets efficiently. Rollings represent vertical partitions of a Collection\u0026rsquo;s dataset, allowing dynamic scaling and efficient querying of subsets of data.\nRolling to Partition # Each rolling is horizontally partitioned into 256 partitions, each containing a subset of documents. Partitions enable horizontal scalability and performance optimization within a rolling.\nPartition to Shard # Partitions are dynamically mapped to shards within a collection. Shards can scale out to multiple shards or merge back into a single shard for improved search performance and resource utilization.\nDocument to Field # Documents consist of fields that store specific attributes or properties. Fields enable structured indexing and searching of documents based on their content.\nStore to Index # The store persists documents and associated metadata, while indices facilitate efficient retrieval of documents based on search queries. 
Stores and indices work together to provide fast and reliable data access within the Pizza search engine.\n"});index.add({'id':3,'href':'/docs/references/collection/create/','title':"Create a collection",'section':"Collection",'content':"Create a collection # Creates a new collection.\nExamples # The following request creates a new collection called my-collection in the namespace my-namespace:\nPUT /my-namespace:my-collection If creating a collection within the default namespace, it can be simplified as:\nPUT /my-collection Request # PUT /[\u0026lt;namespace\u0026gt;:]\u0026lt;name\u0026gt; Path parameters # \u0026lt;namespace\u0026gt;\n(Optional, string) The namespace which the collection belongs to. Namespace names must meet the following criteria: Lowercase only Cannot include \\ /, *, ?, \u0026quot;, \u0026lt;, \u0026gt;, |, , ,, # Cannot start with -, _, + Cannot be . or .. Cannot be longer than 255 bytes (note it is bytes, so multi-byte characters will count towards the 255 limit faster) \u0026lt;name\u0026gt;\n(Required, string) Name of the collection you wish to create. Collection names must meet the same criteria as namespace names. 
Request body # settings\nCollection settings\n collection.partitions_sharding_strategy\n(Optional, object) Specifies the initial number of primary shards and how partitions are assigned to them for every rolling in this collection.\nBy default, only 1 shard is created, and all partitions are assigned to it.\nSupported strategies are listed below:\n Hash\n\u0026#34;hash\u0026#34;: { \u0026#34;number_of_shards\u0026#34;: \u0026lt;num_shards\u0026gt; } You specify the total number of shards, and partitions are assigned to shards as follows:\nshard_index = hash(partition_id) mod number_of_shards Range\n\u0026#34;range\u0026#34;: [\u0026#34;0..128, 129..255\u0026#34;] An array of partition ranges must be provided. Each array item represents a shard, and the partitions specified in it are assigned to that shard.\nThe above example evenly assigns 256 partitions to 2 shards.\n collection.partitions_sharding_strategy\n(Optional, Integer) The number of replicas each primary shard has. Defaults to 1.\n "});index.add({'id':4,'href':'/docs/references/document/create/','title':"Create a document",'section':"Document",'content':"Create a document # Creates a new document.\nExamples # Insert a JSON document into the my-collection collection:\nPOST /my-collection/_doc { \u0026#34;message\u0026#34;: \u0026#34;GET /search HTTP/1.1 200 1070000\u0026#34;, \u0026#34;org\u0026#34;: { \u0026#34;id\u0026#34;: \u0026#34;infini\u0026#34; } } The API returns the following result:\n{ \u0026#34;_id\u0026#34;: \u0026#34;0,0\u0026#34;, \u0026#34;_version\u0026#34;: 1, \u0026#34;_namespace\u0026#34;: \u0026#34;default\u0026#34;, \u0026#34;_collection\u0026#34;: \u0026#34;my-collection\u0026#34;, \u0026#34;result\u0026#34;: \u0026#34;created\u0026#34;, ... 
} The API supports passing a customized UUID as the document identifier, e.g.:\nPOST /my-collection/_doc/news_001 { \u0026#34;message\u0026#34;: \u0026#34;GET /search HTTP/1.1 200 1070000\u0026#34;, \u0026#34;org\u0026#34;: { \u0026#34;id\u0026#34;: \u0026#34;infini\u0026#34; } } Request # POST /\u0026lt;target\u0026gt;/_doc/[\u0026lt;doc_id\u0026gt;] {\u0026lt;fields\u0026gt;} Path parameters # \u0026lt;target\u0026gt;\n(Required, string) Name of the collection to target. \u0026lt;doc_id\u0026gt;\n(Optional, string) The unique identifier of the document, auto-generated if not specified. Request body # \u0026lt;fields\u0026gt;\n(Required, string) Request body contains the JSON source for the document data. "});index.add({'id':5,'href':'/docs/references/namespace/create/','title':"Create a namespace",'section':"Namespace",'content':"Create a namespace # Creates a new namespace.\nExamples # If creating a website namespace, the following request creates a new namespace called website:\nPUT /_namespace/website Request # PUT /_namespace/\u0026lt;name\u0026gt; Path parameters # \u0026lt;name\u0026gt;\n(Required, string) The name of the namespace. Namespace names must meet the following criteria: Lowercase only Cannot include \\ /, *, ?, \u0026quot;, \u0026lt;, \u0026gt;, |, , ,, # Cannot start with -, _, + Cannot be . or .. 
Cannot be longer than 255 bytes (note it is bytes, so multi-byte characters will count towards the 255 limit faster) "});index.add({'id':6,'href':'/docs/references/index/create/','title':"Create an index",'section':"Index",'content':"Create an index # Creates a new index under a collection.\nExamples # The following request creates a new index called my_index under collection my-namespace:my-collection\nPUT /my-namespace:my-collection/_index/my_index Request # PUT /\u0026lt;target\u0026gt;/_index/\u0026lt;name\u0026gt; Path parameters # \u0026lt;target\u0026gt;\n(Required, string) The collection which the index belongs to.\n \u0026lt;name\u0026gt;\n(Required, string) Name of the index you wish to create.\n "});index.add({'id':7,'href':'/docs/references/aggregation/date-histogram/','title':"Date histogram aggregation",'section':"Aggregation",'content':"Date histogram aggregation # This multi-bucket aggregation is similar to the normal histogram, but it can only be used with date or date range values. Because dates are represented internally in Elasticsearch as long values, it is possible, but not as accurate, to use the normal histogram on dates as well. The main difference in the two APIs is that here the interval can be specified using date/time expressions. Time-based data requires special support because time-based intervals are not always a fixed length.\nExamples # As an example, here is an aggregation requesting bucket intervals of a month in calendar time:\nPOST /sales/_search { \u0026#34;aggs\u0026#34;: { \u0026#34;sales_over_time\u0026#34;: { \u0026#34;date_histogram\u0026#34;: { \u0026#34;field\u0026#34;: \u0026#34;date\u0026#34;, \u0026#34;calendar_interval\u0026#34;: \u0026#34;1M\u0026#34; } } } } Response:\n{ ... 
\u0026#34;aggregations\u0026#34;: { \u0026#34;sales_over_time\u0026#34;: { \u0026#34;buckets\u0026#34;: [ { \u0026#34;key\u0026#34;: 1420070400000, \u0026#34;doc_count\u0026#34;: 3 }, { \u0026#34;key\u0026#34;: 1422748800000, \u0026#34;doc_count\u0026#34;: 2 }, { \u0026#34;key\u0026#34;: 1425168000000, \u0026#34;doc_count\u0026#34;: 2 } ] } } } Parameters for date_histogram # field\n(Required, string) Field you wish to aggregate. calendar_interval # (Optional, string) Calendar-aware intervals are configured with the calendar_interval parameter. You can specify calendar intervals using the unit name, such as month, or as a single unit quantity, such as 1M. For example, day and 1d are equivalent. Multiple quantities, such as 2d, are not supported.\nThe accepted calendar intervals are:\n minute, 1m\nAll minutes begin at 00 seconds. One minute is the interval between 00 seconds of the first minute and 00 seconds of the following minute in the specified time zone, compensating for any intervening leap seconds, so that the number of minutes and seconds past the hour is the same at the start and end. hour, 1h\nAll hours begin at 00 minutes and 00 seconds. One hour (1h) is the interval between 00:00 minutes of the first hour and 00:00 minutes of the following hour in the specified time zone, compensating for any intervening leap seconds, so that the number of minutes and seconds past the hour is the same at the start and end. day, 1d\nAll days begin at the earliest possible time, which is usually 00:00:00 (midnight). One day (1d) is the interval between the start of the day and the start of the following day in the specified time zone, compensating for any intervening time changes. week, 1w\nOne week is the interval between the start day_of_week:hour:minute:second and the same day of the week and time of the following week in the specified time zone. 
month, 1M\nOne month is the interval between the start day of the month and time of day and the same day of the month and time of the following month in the specified time zone, so that the day of the month and time of day are the same at the start and end. quarter, 1q\nOne quarter is the interval between the start day of the month and time of day and the same day of the month and time of day three months later, so that the day of the month and time of day are the same at the start and end. year, 1y\nOne year is the interval between the start day of the month and time of day and the same day of the month and time of day the following year in the specified time zone, so that the date and time are the same at the start and end. fixed_interval # Fixed intervals are configured with the fixed_interval parameter.\nIn contrast to calendar-aware intervals, fixed intervals are a fixed number of SI units and never deviate, regardless of where they fall on the calendar. One second is always composed of 1000ms. This allows fixed intervals to be specified in any multiple of the supported units.\nHowever, it means fixed intervals cannot express other units such as months, since the duration of a month is not a fixed quantity. Attempting to specify a calendar interval like month or quarter will throw an exception.\nThe accepted units for fixed intervals are:\n milliseconds (ms)\nA single millisecond. This is a very, very small interval. seconds (s)\nDefined as 1000 milliseconds each. minutes (m)\nDefined as 60 seconds each (60,000 milliseconds). All minutes begin at 00 seconds. hours (h)\nDefined as 60 minutes each (3,600,000 milliseconds). All hours begin at 00 minutes and 00 seconds. days (d)\nDefined as 24 hours (86,400,000 milliseconds). All days begin at the earliest possible time, which is usually 00:00:00 (midnight). 
"});index.add({'id':8,'href':'/docs/references/index/delete/','title':"Delete an index",'section':"Index",'content':"Delete an index # Deletes an existing index under a collectioin.\nExamples # The following request deletes the index called my-index under collection my-namespace:my-collection\nDELETE /my-namespace:my-collection/_index/my_index Request # DELETE /\u0026lt;target\u0026gt;/_index/\u0026lt;name\u0026gt; Path parameters # \u0026lt;target\u0026gt;\n(Required, string) The collection which the index will be removed from.\n \u0026lt;name\u0026gt;\n(Required, string) Name of the index you wish to delete.\n "});index.add({'id':9,'href':'/docs/getting-started/installation/','title':"Installation",'section':"Getting started",'content':"Installation # Pizza is compatible with all major operating systems. The package is compiled statically, and it does not require any external dependencies.\nAutomatic installation # Use the following command to automatically download the latest version of INFINI Pizza for your platform and extract it into /opt/pizza:\ncurl -sSL http://get.infini.cloud | bash -s -- -p pizza The optional parameters for the script are as follows:\n -v \u0026lt;version number\u0026gt; (default is the latest version) -d \u0026lt;installation directory\u0026gt; (default is /opt/pizza) Manual installation # Visit the URL below to download the package for your operating system:\nhttps://release.infinilabs.com/\nVerification of the installation # Assuming Pizza is in your $PATH after installation, run the following command to ensure the package has been installed correctly:\n$ pizza --version PIZZA 0.1.0 Starting the server # Start Pizza as follows with the configuration:\n$ pizza --config pizza.yaml ___ _____ __________ _ / _ \\\\_ \\/ _ / _ / /_\\ / /_)/ / /\\/\\// /\\// / //_\\\\ / ___/\\/ /_ / //\\/ //\\/ _ \\ \\/ \\____/ /____/____/\\_/ \\_/ [PIZZA] The Next-Gen Real-Time Hybrid Search \u0026amp; AI-Native Innovation Engine. ... 
Interaction with the server # Assuming Pizza is listening on 127.0.0.1:9200, use the following command to create a collection named testing:\ncurl -XPUT http://127.0.0.1:9200/testing Refer to the reference page for more APIs.\nShutdown the server # Press Ctrl+C to shut down Pizza, and the message below is displayed:\n... __ _ __ ____ __ _ __ __ / // |/ // __// // |/ // / / // || // _/ / // || // / /_//_/|_//_/ /_//_/|_//_/ ©INFINI.LTD, All Rights Reserved. "});index.add({'id':10,'href':'/docs/references/aggregation/max/','title':"Max aggregation",'section':"Aggregation",'content':"Max aggregation # A single-value metrics aggregation that keeps track of and returns the maximum value among the numeric values extracted from the aggregated documents.\nExamples # Computing the max price value across all documents:\nPOST /sales/_search { \u0026#34;aggs\u0026#34;: { \u0026#34;max_price\u0026#34;: { \u0026#34;max\u0026#34;: { \u0026#34;field\u0026#34;: \u0026#34;price\u0026#34; } } } } Response:\n{ ... \u0026#34;aggregations\u0026#34;: { \u0026#34;max_price\u0026#34;: { \u0026#34;value\u0026#34;: 200.0 } } } As can be seen, the name of the aggregation (max_price above) also serves as the key by which the aggregation result can be retrieved from the returned response.\nParameters for max # field\n(Required, string) Field you wish to aggregate. "});index.add({'id':11,'href':'/docs/references/aggregation/min/','title':"Min aggregation",'section':"Aggregation",'content':"Min aggregation # A single-value metrics aggregation that keeps track of and returns the minimum value among numeric values extracted from the aggregated documents.\nExamples # Computing the min price value across all documents:\nPOST /sales/_search { \u0026#34;aggs\u0026#34;: { \u0026#34;min_price\u0026#34;: { \u0026#34;min\u0026#34;: { \u0026#34;field\u0026#34;: \u0026#34;price\u0026#34; } } } } Response:\n{ ... 
\u0026#34;aggregations\u0026#34;: { \u0026#34;min_price\u0026#34;: { \u0026#34;value\u0026#34;: 10.0 } } } As can be seen, the name of the aggregation (min_price above) also serves as the key by which the aggregation result can be retrieved from the returned response.\nParameters for min # field\n(Required, string) Field you wish to aggregate. "});index.add({'id':12,'href':'/docs/references/namespace/','title':"Namespace",'section':"References",'content':"Namespace # Pizza supports a multi-tenant architecture, allowing different sets of data for various scenarios to be stored within a single engine. Each set is referred to as a namespace, and different namespaces can have distinct topologies and access permissions configured.\nUsually, there\u0026rsquo;s no need to set up an additional namespace, and the default namespace is default.\nWithin a namespace, there are several types of data:\n Collection, Docs: Collections of documents. Data: Source data, stored in columns layout. Index: Indexed data, built based on Data, optional. View: View data, composite views of data across collections. Namespace management # Namespace APIs are used to manage individual namespaces and their settings.\n Create a namespace Delete a namespace "});index.add({'id':13,'href':'/docs/overview/','title':"Overview",'section':"Documentation",'content':"Overview # Introduction # INFINI Pizza is a distributed hybrid search database system. Our mission is to deliver real-time smart search experiences tailored for enterprises by fully harnessing the potential of modern hardware and the AI capability. We are committed to meeting the demands of high concurrency and high throughput in challenging environments, all while providing seamless and efficient search capabilities.\nFeatures # The Next-Gen Real-Time Search \u0026amp; AI-Native Innovation Engine Written in Rust.\n Major Features of Pizza:\n True Real-Time, get search results instantly after insertion, no need to refresh anymore. 
Supports partial updates in place, no need to pull and push back the entire document. High performance, lightning fast with high throughput and low latency, hardware reduced. High scalability, supports very large-scale clusters, beyond petabytes. Native integration with LLMs and ML, empowering AI-Native enterprise innovation. Designed with storage and computation separation, and also storage and index separation. Architecture # Pizza employs a shared-nothing architecture, designed for modern hardware, ensuring complete isolation of resources at both the node and per-CPU level. Pizza accesses I/O and network resources fully asynchronously, unleashing the power of multiple cores, large memory, and NVMe SSDs.\n Learn more about Pizza\u0026rsquo;s architecture.\nWhy Pizza # The name Pizza was taken from our unique sharding design.\nThe documents in Pizza are persisted as Parquet files in object storage. This enables native integration with other big data systems through object storage and the standard Parquet format.\nWhen to use Pizza # Pizza is a good fit when:\n You have latency-sensitive search applications where every millisecond matters. You need fresh data, your data is mutable, and you need fast queries. You need to handle high concurrency with complex queries. You need to handle petabytes of data or more for user-facing use cases. You need to handle JOIN for complex data relations. You need to keep thousands of fields, but only a handful are subject to change. You need to manage both structured and unstructured data in a cohesive manner. Pizza is designed to address these problems at heart, to solve real critical business issues, and to serve your data-driven applications in real time at very large scale, enhancing and enriching the data experiences of your end users.\nDesign choices # The philosophy of Pizza is that indices should be designed per use case, and should not attempt to fit every use case with a single index. 
Therefore, we introduced Views, which allow combining different document sources into a single index or separating a document into different layers of indices for different use cases.\nPizza emphasizes the decoupling of storage and computation, as well as the separation of storage and index, which enables efficient and scalable data processing by allowing independent management and optimization of storage resources, computational resources, and indexing strategies.\nNative integration with LLMs (Large Language Models) and ML (Machine Learning) technologies is a key aspect of Pizza, providing powerful capabilities for AI-Native enterprise innovation. By seamlessly integrating with LLMs and ML frameworks, Pizza enables advanced natural language processing, machine learning, and data analytics directly within the search and data retrieval pipeline.\nWe are in the process of building the next-generation search infrastructure, driven by our unwavering commitment to delivering real-time search experiences for enterprises, unlocking the potential of modern hardware, and catering to the demands of high concurrency and high throughput in the most challenging of environments.\nNext step # Install and configure Pizza.\n"});index.add({'id':14,'href':'/docs/references/aggregation/percentiles/','title':"Percentiles aggregation",'section':"Aggregation",'content':"Percentiles aggregation # A multi-value metrics aggregation that calculates one or more percentiles over numeric values extracted from the aggregated documents.\nPercentiles show the point at which a certain percentage of observed values occur. For example, the 95th percentile is the value which is greater than 95% of the observed values.\nPercentiles are often used to find outliers. In normal distributions, the 0.13th and 99.87th percentiles represent three standard deviations from the mean. 
Any data which falls outside three standard deviations is often considered an anomaly.\nWhen a range of percentiles is retrieved, they can be used to estimate the data distribution and determine if the data is skewed, bimodal, etc.\nExamples # Assume your data consists of website load times. The average and median load times are not overly useful to an administrator. The max may be interesting, but it can be easily skewed by a single slow response.\nLet\u0026rsquo;s look at a range of percentiles representing load time:\nPOST latency/_search { \u0026#34;aggs\u0026#34;: { \u0026#34;load_time_outlier\u0026#34;: { \u0026#34;percentiles\u0026#34;: { \u0026#34;field\u0026#34;: \u0026#34;load_time\u0026#34; } } } } By default, the percentile metric will generate a range of percentiles: [1, 5, 25, 50, 75, 95, 99]. The response will look like this:\n{ ... \u0026#34;aggregations\u0026#34;: { \u0026#34;load_time_outlier\u0026#34;: { \u0026#34;values\u0026#34;: { \u0026#34;1.0\u0026#34;: 10.0, \u0026#34;5.0\u0026#34;: 30.0, \u0026#34;25.0\u0026#34;: 170.0, \u0026#34;50.0\u0026#34;: 445.0, \u0026#34;75.0\u0026#34;: 720.0, \u0026#34;95.0\u0026#34;: 940.0, \u0026#34;99.0\u0026#34;: 980.0 } } } } As you can see, the aggregation will return a calculated value for each percentile in the default range. If we assume response times are in milliseconds, it is immediately obvious that the webpage normally loads in 10-725ms, but occasionally spikes to 945-985ms.\nOften, administrators are only interested in outliers — the extreme percentiles. We can specify just the percents we are interested in (requested percentiles must be a value between 0-100 inclusive):\nPOST latency/_search { \u0026#34;aggs\u0026#34;: { \u0026#34;load_time_outlier\u0026#34;: { \u0026#34;percentiles\u0026#34;: { \u0026#34;field\u0026#34;: \u0026#34;load_time\u0026#34;, \u0026#34;percents\u0026#34;: [95, 99, 99.9] } } } } Parameters for percentiles # field\n(Required, string) Field you wish to aggregate. 
percents\n(Optional, array) A range of percentiles that are calculated. Default is [1, 5, 25, 50, 75, 95, 99]. keyed # By default the keyed flag is set to true which associates a unique string key with each bucket and returns the ranges as a hash rather than an array. Setting the keyed flag to false will disable this behavior:\nPOST latency/_search { \u0026#34;aggs\u0026#34;: { \u0026#34;load_time_outlier\u0026#34;: { \u0026#34;percentiles\u0026#34;: { \u0026#34;field\u0026#34;: \u0026#34;load_time\u0026#34;, \u0026#34;keyed\u0026#34;: false } } } } Response:\n{ ... \u0026#34;aggregations\u0026#34;: { \u0026#34;load_time_outlier\u0026#34;: { \u0026#34;values\u0026#34;: [ { \u0026#34;key\u0026#34;: 1.0, \u0026#34;value\u0026#34;: 10.0 }, { \u0026#34;key\u0026#34;: 5.0, \u0026#34;value\u0026#34;: 30.0 }, { \u0026#34;key\u0026#34;: 25.0, \u0026#34;value\u0026#34;: 170.0 }, { \u0026#34;key\u0026#34;: 50.0, \u0026#34;value\u0026#34;: 445.0 }, { \u0026#34;key\u0026#34;: 75.0, \u0026#34;value\u0026#34;: 720.0 }, { \u0026#34;key\u0026#34;: 95.0, \u0026#34;value\u0026#34;: 940.0 }, { \u0026#34;key\u0026#34;: 99.0, \u0026#34;value\u0026#34;: 980.0 } ] } } } "});index.add({'id':15,'href':'/docs/references/search/prefix/','title':"Prefix query",'section':"Search",'content':"Prefix query # Returns documents that contain a specific prefix in a provided field.\nExamples # The following search returns documents where the org.id field contains a term that begins with inf.\nGET /_search { \u0026#34;query\u0026#34;: { \u0026#34;prefix\u0026#34;: { \u0026#34;org.id\u0026#34;: { \u0026#34;value\u0026#34;: \u0026#34;inf\u0026#34; } } } } Top-level parameters for prefix # \u0026lt;field\u0026gt;\n(Required, object) Field you wish to search. Parameters for \u0026lt;field\u0026gt; # value\n(Required, string) Beginning characters of terms you wish to find in the provided \u0026lt;field\u0026gt;. 
case_insensitive\n(Optional, Boolean) Allows ASCII case insensitive matching of the value with the indexed field values when set to true. Default is false. "});index.add({'id':16,'href':'/docs/references/search/range/','title':"Range query",'section':"Search",'content':"Range query # Returns documents that contain terms within a provided range.\nExamples # The following search returns documents where the age field contains a term between 10 and 20.\nGET /_search { \u0026#34;query\u0026#34;: { \u0026#34;range\u0026#34;: { \u0026#34;age\u0026#34;: { \u0026#34;gte\u0026#34;: 10, \u0026#34;lte\u0026#34;: 20 } } } } Top-level parameters for range # \u0026lt;field\u0026gt;\n(Required, object) Field you wish to search. Parameters for \u0026lt;field\u0026gt; # gt\n(Optional) Greater than. gte\n(Optional) Greater than or equal to. lt\n(Optional) Less than. lte\n(Optional) Less than or equal to. "});index.add({'id':17,'href':'/docs/references/search/regexp/','title':"Regexp query",'section':"Search",'content':"Regexp query # Returns documents that contain terms matching a regular expression.\nA regular expression is a way to match patterns in data using placeholder characters, called operators. For a list of operators supported by the regexp query, see Regular expression syntax.\nExamples # The following search returns documents where the org.id field contains any term that begins with in and ends with i. The .* operators match any characters of any length, including no characters. Matching terms can include ini, inni, and infini.\nGET /_search { \u0026#34;query\u0026#34;: { \u0026#34;regexp\u0026#34;: { \u0026#34;org.id\u0026#34;: { \u0026#34;value\u0026#34;: \u0026#34;in.*i\u0026#34;, \u0026#34;case_insensitive\u0026#34;: true } } } } Top-level parameters for regexp # \u0026lt;field\u0026gt;\n(Required, object) Field you wish to search. 
Parameters for \u0026lt;field\u0026gt; # value\n(Required, string) Regular expression for terms you wish to find in the provided \u0026lt;field\u0026gt;. For a list of supported operators, see Regular expression syntax. case_insensitive\n(Optional, Boolean) Allows ASCII case-insensitive matching of the value with the indexed field values when set to true. Default is false. "});index.add({'id':18,'href':'/docs/references/aggregation/sum/','title':"Sum aggregation",'section':"Aggregation",'content':"Sum aggregation # A single-value metrics aggregation that sums up numeric values that are extracted from the aggregated documents.\nExamples # Assuming the data consists of documents representing sales records, we can sum the sale price of all hats with:\nPOST /sales/_search { \u0026#34;query\u0026#34;: { \u0026#34;constant_score\u0026#34;: { \u0026#34;filter\u0026#34;: { \u0026#34;match\u0026#34;: { \u0026#34;type\u0026#34;: \u0026#34;hat\u0026#34; } } } }, \u0026#34;aggs\u0026#34;: { \u0026#34;hat_prices\u0026#34;: { \u0026#34;sum\u0026#34;: { \u0026#34;field\u0026#34;: \u0026#34;price\u0026#34; } } } } Resulting in:\n{ ... \u0026#34;aggregations\u0026#34;: { \u0026#34;hat_prices\u0026#34;: { \u0026#34;value\u0026#34;: 450.0 } } } The name of the aggregation (hat_prices above) also serves as the key by which the aggregation result can be retrieved from the returned response.\nParameters for sum # field\n(Required, string) Field you wish to aggregate. 
"});index.add({'id':19,'href':'/docs/references/search/term/','title':"Term query",'section':"Search",'content':"Term query # Returns documents that contain an exact term in a provided field.\nYou can use the term query to find documents based on a precise value such as a price, a product ID, or a username.\nExamples # GET /_search { \u0026#34;query\u0026#34;: { \u0026#34;term\u0026#34;: { \u0026#34;org.id\u0026#34;: { \u0026#34;value\u0026#34;: \u0026#34;infini\u0026#34; } } } } Top-level parameters for term # \u0026lt;field\u0026gt;\n(Required, object) Field you wish to search. Parameters for \u0026lt;field\u0026gt; # value\n(Required, string) Term you wish to find in the provided \u0026lt;field\u0026gt;. To return a document, the term must exactly match the field value, including whitespace and capitalization. case_insensitive\n(Optional, Boolean) Allows ASCII case insensitive matching of the value with the indexed field values when set to true. Default is false. "});index.add({'id':20,'href':'/docs/references/aggregation/terms/','title':"Terms aggregation",'section':"Aggregation",'content':"Terms aggregation # A multi-bucket value source based aggregation where buckets are dynamically built - one per unique value.\nExamples # POST /_search { \u0026#34;aggs\u0026#34;: { \u0026#34;genres\u0026#34;: { \u0026#34;terms\u0026#34;: { \u0026#34;field\u0026#34;: \u0026#34;genre\u0026#34; } } } } Response:\n{ ... \u0026#34;aggregations\u0026#34;: { \u0026#34;genres\u0026#34;: { \u0026#34;doc_count_error_upper_bound\u0026#34;: 0, \u0026#34;sum_other_doc_count\u0026#34;: 0, \u0026#34;buckets\u0026#34;: [ { \u0026#34;key\u0026#34;: \u0026#34;electronic\u0026#34;, \u0026#34;doc_count\u0026#34;: 6 }, { \u0026#34;key\u0026#34;: \u0026#34;rock\u0026#34;, \u0026#34;doc_count\u0026#34;: 3 }, { \u0026#34;key\u0026#34;: \u0026#34;jazz\u0026#34;, \u0026#34;doc_count\u0026#34;: 2 } ] } } } Parameters for terms # field\n(Required, string) Field you wish to aggregate. 
"});index.add({'id':21,'href':'/docs/references/aggregation/value-count/','title':"Value count aggregation",'section':"Aggregation",'content':"Value count aggregation # A single-value metrics aggregation that counts the number of values that are extracted from the aggregated documents. Typically, this aggregator will be used in conjunction with other single-value aggregations. For example, when computing the avg one might be interested in the number of values the average is computed over.\nvalue_count does not de-duplicate values, so even if a field has duplicates each value will be counted individually.\nExamples # Assuming the data consists of documents representing sales records we can sum the sale price of all hats with:\nPOST /sales/_search { \u0026#34;aggs\u0026#34; : { \u0026#34;types_count\u0026#34; : { \u0026#34;value_count\u0026#34; : { \u0026#34;field\u0026#34; : \u0026#34;type\u0026#34; } } } } Response:\n{ ... \u0026#34;aggregations\u0026#34;: { \u0026#34;types_count\u0026#34;: { \u0026#34;value\u0026#34;: 7 } } } The name of the aggregation (types_count above) also serves as the key by which the aggregation result can be retrieved from the returned response.\nParameters for avg # field\n(Required, string) Field you wish to aggregate. "});index.add({'id':22,'href':'/docs/administration/observibility/state/','title':"Cluster state",'section':"Observability",'content':"Cluster state # Returns an internal representation of the cluster state for debugging or diagnostic purposes.\nGet the whole cluster state # Requests # GET /_cluster/state/\u0026lt;names\u0026gt; Path Parameters # names\n(Optional, string) A comma-separated list of the following options:\n _all\nShows all names. blocks\nShows the blocks part of the response. leader_node\nShows the leader_node part of the response. metadata\nShows the metadata part of the response. nodes\nShows the nodes part of the response. routing_nodes\nShows the routing_nodes part of the response. 
routing_table\nShows the routing_table part of the response. version\nShows the cluster state version. Get the state of a specific region # Requests # GET /_cluster/_region/\u0026lt;region_id\u0026gt;/state/\u0026lt;names\u0026gt; Path parameters # region_id\n(Required, String) The UUID of the region you want to query. A special ID _local can be specified to query the state of the region that handles this request.\n names\n(Optional, string) A comma-separated list of options, see the names parameter of the cluster state API for the full list of options.\n "});index.add({'id':23,'href':'/docs/references/collection/','title':"Collection",'section':"References",'content':"Collection # A \u0026ldquo;Collection\u0026rdquo; typically refers to a grouping or container for related data items in a database or similar data storage system. It can hold various types of data, such as documents, records, or other structured data elements. In the context of the previous discussion about namespaces and data types, a collection could contain documents, each representing a specific piece of information or record.\nCollection management # Collection APIs are used to manage individual collections and settings.\n Create a collection Delete a collection Get collection Get collection schema Get collection settings Get collection index "});index.add({'id':24,'href':'/docs/getting-started/configuration/','title':"Configuration",'section':"Getting started",'content':"Configuration # Pizza supports several methods to overwrite the default configuration.\nCommand lines # ➜ ./bin/pizza --help A Distributed Real-Time Search \u0026amp; AI-Native Innovation Engine. 
Usage: pizza [OPTIONS] [COMMAND] Commands: service Builtin service management (install, uninstall, start, stop) help Print this message or the help of the given subcommand(s) Options: -l, --log \u0026lt;LEVEL\u0026gt; Set the logging level, options: trace,debug,info,warn,error --debug Run in debug mode, panic immediately with full stack trace -c, --config \u0026lt;FILE\u0026gt; -p, --pid \u0026lt;FILE\u0026gt; Place pid to this file -E, --override \u0026lt;KEY=VALUE\u0026gt; -h, --help Print help -V, --version Print version Configuration file # You can fully customize Pizza by utilizing the pizza.yaml configuration file:\n# ======================== INFINI Pizza Configuration ========================== # -------------------------------- Log ----------------------------------------- log: level: info # -------------------------------- API ----------------------------------------- gateway: network: binding: 127.0.0.1:9100 skip_occupied_port: true # -------------------------------- Cluster ------------------------------------- cluster: name: pizza node: name: my_node_1 network: binding: 127.0.0.1:8100 skip_occupied_port: true # -------------------------------- Storage ------------------------------------- storage: compression: ZSTD # -------------------------------- MemTable ------------------------------------ memtable: threshold: 1k max_num_of_instance: 2 allow_multi_instance: true Override configuration # You can tweak the configuration by passing the -E command-line option in KEY=VALUE form when starting Pizza:\n./bin/pizza -E log.level=trace -E gateway.network.binding=127.0.0.1:12200 "});index.add({'id':25,'href':'/docs/references/collection/delete/','title':"Delete a collection",'section':"Collection",'content':"Delete a collection # Delete an existing collection.\nExamples # The following request deletes the collection called my-collection:\nDELETE my-collection Request # DELETE /[\u0026lt;namespace\u0026gt;:]\u0026lt;name\u0026gt; Path Parameters # 
\u0026lt;namespace\u0026gt;\n(Optional, string) The namespace which the collection belongs to. \u0026lt;name\u0026gt;\n(Required, string) Name of the collection you wish to delete. "});index.add({'id':26,'href':'/docs/references/namespace/delete/','title':"Delete a namespace",'section':"Namespace",'content':"Delete a namespace # Delete an existing namespace.\nExamples # The following request deletes the namespace called website:\nDELETE /_namespace/website Request # DELETE /_namespace/\u0026lt;name\u0026gt; Path parameters # \u0026lt;name\u0026gt;\n(Optional, string) The name of the namespace that you want to delete. "});index.add({'id':27,'href':'/docs/references/collection/get/','title':"Get collection",'section':"Collection",'content':"Get collection # Returns information about one or more collections.\nExamples # The following request gets all the collections under the default namespace:\nGET /default:* Request # GET /\u0026lt;target\u0026gt; Path Parameters # target\n(Required, String) Comma-separated, names of the collections to get (wildcard supported) "});index.add({'id':28,'href':'/docs/references/collection/get_index/','title':"Get collection index",'section':"Collection",'content':"Get collection index # See the following documents:\n Get index Get index alias Get index mapping Get index settings "});index.add({'id':29,'href':'/docs/references/collection/get_schema/','title':"Get collection schema",'section':"Collection",'content':"Get collection schema # Returns schema information about one or more collections.\nExamples # The following request gets the schema information of all the collections under the default namespace:\nGET /default:*/_schema Retrieve the schema information of all the collections:\nGET /_schema Request # GET /\u0026lt;target\u0026gt;/_schema Path Parameters # target\n(Optional, String) Comma-separated, names of the collections to get (wildcard supported) 
"});index.add({'id':30,'href':'/docs/references/collection/get_settings/','title':"Get collection settings",'section':"Collection",'content':"Get collection settings # Returns settings information about one or more collections.\nExamples # The following request gets the settings information of all the collections under the default namespace:\nGET /default:*/_settings Retrieve the settings information of all the collections:\nGET /_settings Request # GET /\u0026lt;target\u0026gt;/_settings Path Parameters # target\n(Optional, String) Comma-separated, names of the collections to get (wildcard supported) "});index.add({'id':31,'href':'/docs/references/index/get/','title':"Get index",'section':"Index",'content':"Get index # Returns information about one or more indices\nExamples # Get the information of all indices:\nGET /_index Get the information of the index named my-index under collection my-collection:\nGET /my-collection/_index/my-index Request # GET /_index GET /\u0026lt;target\u0026gt;/_index GET /\u0026lt;target\u0026gt;/_index/\u0026lt;index\u0026gt; Path Parameters # target\n(Required, String) Comma-separated, names of the collections to specify (wildcard supported)\n index\n(Required, String) Comma-separated, names of the indices to get (wildcard supported)\n "});index.add({'id':32,'href':'/docs/references/index/get_alias/','title':"Get index alias",'section':"Index",'content':"Get index alias # Returns alias information about one or more indices under the specified collection.\nExamples # Get the alias information of the my-index index under collection my-collection:\nGET /my-collection/_index/my-index/_alias Request # GET /\u0026lt;target\u0026gt;/_index/\u0026lt;index\u0026gt;/_alias GET /\u0026lt;target\u0026gt;/_index/_alias Path Parameters # target\n(Required, String) Comma-separated, names of the collections to specify (wildcard supported)\n index\n(Optional, String) Comma-separated, names of the indices to get (wildcard supported)\n 
"});index.add({'id':33,'href':'/docs/references/index/get_mapping/','title':"Get index mapping",'section':"Index",'content':"Get index mapping # Returns mapping information about one or more indices under the specified collection.\nExamples # Get the mapping information of the my-index index under collection my-collection:\nGET /my-collection/_index/my-index/_mapping Request # GET /\u0026lt;target\u0026gt;/_index/\u0026lt;index\u0026gt;/_mapping GET /\u0026lt;target\u0026gt;/_index/_mapping Path Parameters # target\n(Required, String) Comma-separated, names of the collections to specify (wildcard supported)\n index\n(Optional, String) Comma-separated, names of the indices to get (wildcard supported)\n "});index.add({'id':34,'href':'/docs/references/index/get_settings/','title':"Get index setting",'section':"Index",'content':"Get index setting # Returns setting information about one or more indices under the specified collection.\nExamples # Get the setting information of the my-index index under collection my-collection:\nGET /my-collection/_index/my-index/_setting Request # GET /\u0026lt;target\u0026gt;/_index/\u0026lt;index\u0026gt;/_setting GET /\u0026lt;target\u0026gt;/_index/_setting Path Parameters # target\n(Required, String) Comma-separated, names of the collections to specify (wildcard supported)\n index\n(Optional, String) Comma-separated, names of the indices to get (wildcard supported)\n "});index.add({'id':35,'href':'/docs/references/index/','title':"Index",'section':"References",'content':"Index # An index is a specialized data structure designed to improve search performance by efficiently organizing and storing data for quick retrieval. Each collection has a default index called default. 
You can create additional indices, each tailored to specific use cases to enhance search speed.\nAn index can manage various aspects such as mappings, settings, and aliases:\n Mappings: Define the schema of the index layout, specifying how fields in the documents are indexed and stored. This includes data types and analyzers used for text fields.\n Settings: Configure the behavior and properties of the index, such as the number of replicas, whether to support realtime indexing or update in place, etc.\n Aliases: Provide a layer of abstraction, allowing multiple indices to be referred to by a single name. This is useful for operations like reindexing, enabling zero-downtime reindexing by switching aliases.\n These components and configurations ensure that an index can be precisely tuned to optimize search performance and accommodate evolving requirements.\nIndex management # Index APIs are used to manage individual indices and settings.\n Create an index Delete an index Get index Get index alias Get index mapping Get index setting "});index.add({'id':36,'href':'/docs/references/document/','title':"Document",'section':"References",'content':"Document # In Pizza, a document is a data structure composed of field-and-value pairs. It\u0026rsquo;s roughly equivalent to a row in a relational database table, but with a dynamic schema. 
Documents are the basic unit of data storage in Pizza, and collections are groupings of documents.\nEach document in Pizza is represented in JSON format, which is a lightweight data interchange format that is easy for humans to read and write and easy for machines to parse and generate.\nDocument management # Document APIs are used to manage documents.\n Create a document Fetch a document Replace a document Partial update a document Delete a document Batch document operation "});index.add({'id':37,'href':'/docs/getting-started/cli/','title':"Pizza CLI",'section':"Getting started",'content':"Pizza CLI # The Pizza Command Line Interface (CLI) is a tool designed to facilitate quick and interactive communication with the Pizza server. It provides a convenient way for users to perform various tasks, such as querying data, managing configurations, and monitoring system status, directly from the command line.\n Features # Interactive Querying # The Pizza CLI allows users to execute queries against the Pizza server interactively. Users can enter commands and receive immediate feedback, enabling rapid exploration and analysis of data.\nConfiguration Management # With the Pizza CLI, users can manage Pizza server configurations effortlessly. They can adjust settings, update parameters, and modify configurations on the fly, all from the command line interface.\nSystem Monitoring # The Pizza CLI provides real-time monitoring capabilities, allowing users to track system performance, monitor resource usage, and identify potential bottlenecks or issues promptly.\nUsage # To use the Pizza CLI, simply launch the command line interface and enter the desired commands. 
The CLI provides intuitive prompts and options to guide users through various operations.\nOptions # Start with your Pizza endpoint: ./cli http://localhost:9100/\n"});index.add({'id':38,'href':'/docs/overview/architecture/','title':"Architecture",'section':"Overview",'content':"Architecture # Share-Nothing and Asynchronous I/O in Pizza # Pizza is built upon a robust share-nothing architecture, ensuring complete isolation of resources at both the node and per-CPU level. Each CPU core and associated threads operate independently, without sharing memory or resources with other cores or nodes. Additionally, Pizza embraces a fully asynchronous manner to access I/O and network resources, leveraging technologies like io_uring for efficient I/O operations.\n [Pizza Architecture] Why Share-Nothing and Asynchronous I/O? # The combination of share-nothing architecture and asynchronous I/O enables Pizza to seamlessly scale across large-scale datasets and high-throughput workloads, along with optimal performance and high resource utilization.\nMulti-Core Trending and Hardware Considerations # As hardware trends towards increasing numbers of cores per machine, share-nothing architectures become increasingly relevant and advantageous. With machines featuring thousands of cores becoming more common, share-nothing architectures enable efficient utilization of parallelism without encountering bottlenecks associated with shared resources.\nContention and Locking Issues # In share-everything architectures, contention for shared resources and locking mechanisms can become significant bottlenecks, especially in highly parallel environments. Share-nothing architectures eliminate these contention points by ensuring each node operates independently, avoiding the need for centralized locking mechanisms and reducing contention-related performance degradation.\nFault Isolation and Resilience # Pizza\u0026rsquo;s share-nothing architecture enhances fault isolation and system resilience. 
Each CPU core and thread operates autonomously, minimizing the impact of failures or performance degradation on other cores or nodes. Similarly, asynchronous I/O operations isolate I/O-related failures, ensuring that failures in one operation do not affect the execution of others.\nNUMA and Cache-Friendly Design # Pizza is designed to be NUMA-friendly and local cache or memory access-friendly. By minimizing memory access latency and optimizing performance on NUMA architectures, Pizza ensures efficient memory access and optimal performance. Additionally, its cache-friendly design eliminates the need to access remote memory in other CPU\u0026rsquo;s address spaces, reducing cache coherency overhead and improving overall performance.\nAsynchronous I/O # Pizza employs asynchronous I/O and io_uring for efficient, non-blocking I/O operations, enhancing performance and scalability. Unlike traditional synchronous I/O, which involves blocking system calls, asynchronous I/O enables Pizza to reduce latency and improve resource utilization, especially in I/O-bound scenarios. With io_uring, a high-performance asynchronous I/O framework in the Linux kernel, Pizza minimizes system call overhead and optimizes buffer management. This approach delivers improved performance, scalability, resource efficiency, and a better user experience.\n"});index.add({'id':39,'href':'/docs/references/','title':"References",'section':"Documentation",'content':"References # Data management # Namespace APIs Collection APIs Document APIs Index APIs Search and analyze # Search you data Aggregations "});index.add({'id':40,'href':'/docs/administration/','title':"Administration",'section':"Documentation",'content':"Administration # Cluster management # Search and analyze # "});index.add({'id':41,'href':'/docs/overview/sharding/','title':"Why named Pizza",'section':"Overview",'content':"Why named Pizza? 
# Do you wonder why this project is named Pizza?\nTo infinity scaling # Pizza solves the challenge of managing massive data seamlessly. Imagine creating a collection and continuously adding documents, from zero to petabytes, without the need to worry about sharding or reindexing. Scaling your machine becomes effortless, ensuring a smooth, seamless, and painless experience for application developers.\nSharding puzzle # One of the world\u0026rsquo;s three major challenges: What is the appropriate size for an index shard?\n [Sharding Puzzle!] Shards are like cars that transport your data, but determining how many shards you need is challenging because the amount of data is unpredictable and could continuously grow.\n [Sizing Puzzle!] Traditional sharding methods have several shortcomings. Currently, distributed system storage partitioning methods mainly include:\n Range-based partitioning methods require data to have a high dispersion in value ranges. Fixed-factor hash partitioning methods, set at database creation, may lead to over-allocation of resources if the partition factor is too large or performance issues if it\u0026rsquo;s too small. Consistent hashing algorithms lack adaptability to heterogeneous systems and flexibility in data partitioning, resulting in complex operations and suboptimal resource utilization. Is there any other approach?\nPizza\u0026rsquo;s design # Pizza does things differently!\nStart with document ID # Pizza facilitates updates, and it\u0026rsquo;s top-notch. Ensuring efficient updates requires a unique identity for each document. While accommodating a vast dataset beyond trillions of documents, one can opt for a string-based UUID or utilize a uint64 or uint128 assigned to each document. However, utilizing a wide-sized primary key may lead to resource wastage, unnecessary compression, or conversion.\nIn Pizza, document identification follows a two-dimensional approach. 
Each document is assigned a unique identity comprising the rolling ID (rolling_id) and the internally assigned ID (seq_doc_id) within this rolling. These IDs are structured for efficiency, incorporating partition positions for rapid data localization. Rolling IDs and assigned document IDs increment automatically. Document value ranges vary based on the numeric types chosen, accommodating trillions of documents. User-defined IDs seamlessly map to unique IDs.\nWith this design:\n Assigned ID format: [rolling_id], [seq_doc_id]. Assigned IDs adopt a composite two-dimensional structure. Assigned IDs are compact numeric types designed to support massive datasets. Assigned IDs are self-descriptive and include partition positions as routing information for rapid data access. The rolling ID serves as a metadata-level description and does not need to be persisted with each record. The space allocated for sequentially assigned IDs is compact and compression-friendly. Because the assigned ID is a globally unique identity, it is also a good fit for frequent updates. Within a single collection, Pizza offers varying document value ranges to accommodate different scale requirements. For [UInt8, UInt32], rolling_id ranges from 0 to 255 (UInt8) and seq_doc_id ranges from 0 to 4,294,967,295 (UInt32), which can sustain [0 : 1,095,216,660,225] documents. 
If we scale rolling_id to use UInt16, the ranges become [0 : 65,535], [0 : 4,294,967,295] = [0 : 281,470,681,677,825], which means a scale of 281 trillion documents and should be a good start for any case.\nThese capabilities enable Pizza to handle collections of varying sizes, from small datasets to trillions of documents, and to scale efficiently and smoothly on demand.\nUser-defined IDs # \u0026ldquo;But my IDs were shipped from an external database\u0026rdquo;\nThat\u0026rsquo;s fair. Pizza handles this by simply mapping the UUID to a unique assigned document ID.\nFor example, with this document creation:\nPOST /my-collection/_doc/myid { \u0026#34;message\u0026#34;: \u0026#34;GET /search HTTP/1.1 200 1070000\u0026#34;, \u0026#34;org\u0026#34;: { \u0026#34;id\u0026#34;: \u0026#34;infini\u0026#34; } } You will get:\n{ \u0026#34;_key\u0026#34;: \u0026#34;myid\u0026#34;, \u0026#34;_id\u0026#34;: \u0026#34;0,123\u0026#34;, \u0026#34;_version\u0026#34;: 1, \u0026#34;_namespace\u0026#34;: \u0026#34;default\u0026#34;, \u0026#34;_collection\u0026#34;: \u0026#34;my-collection\u0026#34;, \u0026#34;result\u0026#34;: \u0026#34;created\u0026#34;, ... } The _id value 0,123 means that a unique Pizza document ID was assigned to myid within this collection.\nInstead of passing the UUID through further processing, it\u0026rsquo;s common to begin with a search and work from the search result. Each document in the search result should contain both _id and _key; passing both of them as the document identity should work.\nJust like a Pizza # As you\u0026rsquo;ve noted, the maximum number of documents in a single rolling is fixed at 4,294,967,295, typically suitable for smaller use cases. The fixed capacity of a single rolling is not a bug, it is a feature!\nThink of the rolling as the iron plate used for cooking the Pizza, and sending data to the rolling is akin to adding your delicious ingredients to that iron plate.\nWe ingest the data, we enjoy the pizza, just like that!\n [Yummy Pizza!] 
More Pizza - Rolling # So you have more than 4,294,967,295 documents?\nNo worries, let\u0026rsquo;s roll to another rolling. More rolling, more pizza, the party won\u0026rsquo;t stop tonight.\n [More Pizza!] When the capacity of a rolling is exceeded, writes automatically switch to the next rolling.\nRollings can grow infinitely to meet ongoing growth requirements.\nPackaged Pizza # There are many benefits to packaging data like Pizza:\n Each rolling has a fixed size for ease of distribution and physical resource management. The number and size of shards are predictable. Shards are generated on demand, eliminating the need for advance planning. Scalability is infinite, allowing for horizontal expansion. Stable and predictable read/write performance. [Packaged Pizza!] Slicing with partition # Wait, 4.2 billion is not a small number, and a single rolling may still choke, so we introduce partitions within a rolling, just like sharing slices of pizza with friends.\n [Slicing Pizza!] A single rolling can be split into a maximum of 256 physical partitions by default (configurable at creation). A lookup table is used to maintain the relationship between logical partitions and physical shards.\n [Slicing Pizza!] Physical shards and logical partitions can be dynamically split or merged. In scenarios with low write pressure, all partition data within a single shard is consolidated, appearing as a single data directory physically.\nHash based routing # What if we have more than one rolling? How do we know which rolling contains a given UUID?\nCustom or assigned IDs are hashed to establish a one-to-one relationship with partitions:\n[HASH(KEY) or ID] % 256 = PARTITION_ID\n [Routing Slices!] 
In the worst case, a request needs to revisit the same partition across all rollings, but the scope is limited because rollings step in units of 4.2 billion documents, and a UUID mapping cache can be placed in front.\nIt is always better to use the Pizza-assigned _id rather than the _key for mutations: because _rolling_id is part of _id, Pizza knows which rolling to talk to without asking.\nUltimate scaling # Lastly, we will talk about replicas: each shard can have replicas to scale out for more search throughput.\n [Sharding Architecture] Rolling, Partition, Replica - three dimensions for ultimate scaling.\nThe more data you feed in, the more pizza you cook. Enjoy your yummy data!\n"});index.add({'id':42,'href':'/docs/overview/realtime/','title':"How realtime works",'section':"Overview",'content':"How realtime works? # Do you like to let your customer wait? # [Pizza in Realtime] "});index.add({'id':43,'href':'/docs/release-notes/','title':"Release notes",'section':"Documentation",'content':"Release notes # Information about release notes of INFINI Pizza is provided here.\n0.1.0 # Breaking changes # Features # Bug fix # Improvements # "});index.add({'id':44,'href':'/docs/references/document/fetch/','title':"Fetch a document",'section':"Document",'content':"Fetch a document # Retrieve an existing document by specifying its unique identifier.\nExamples # Fetch a document from the my-collection collection with customized uuid news_001:\nGET /my-collection/_doc/news_001 The API returns the following result:\n{ \u0026#34;_id\u0026#34;: \u0026#34;0,0\u0026#34;, \u0026#34;_version\u0026#34;: 1, \u0026#34;_namespace\u0026#34;: \u0026#34;default\u0026#34;, \u0026#34;_collection\u0026#34;: \u0026#34;my-collection\u0026#34;, \u0026#34;_key\u0026#34; : \u0026#34;news_001\u0026#34;, \u0026#34;found\u0026#34;: true, \u0026#34;_source\u0026#34; : { \u0026#34;message\u0026#34;: \u0026#34;GET /search HTTP/1.1 200 1070000\u0026#34;, \u0026#34;org\u0026#34;: { \u0026#34;id\u0026#34;: \u0026#34;infini\u0026#34; } } } As you can see, the customized 
uuid is represented as _key within the document, and an _id is also returned with the value 0,0. This is the internal ID generated by Pizza, and it is guaranteed to be unique, so you can also fetch this document by this value:\nGET /my-collection/_doc/0,0 Request # GET /\u0026lt;target\u0026gt;/_doc/\u0026lt;doc_id\u0026gt; Path parameters # \u0026lt;target\u0026gt;\n(Required, string) Name of the collection to target. \u0026lt;doc_id\u0026gt;\n(Required, string) The unique identifier of this document; both _key and _id are supported. "});index.add({'id':45,'href':'/docs/references/document/replace/','title':"Replace a document",'section':"Document",'content':"Replace a document # Replace an existing document by specifying its unique identifier and the new content.\nExamples # Replace a document news_001 of the collection my-collection with new content:\nPUT /my-collection/_doc/news_001 { \u0026#34;message\u0026#34;: \u0026#34;GET /search HTTP/1.1 200 1070000\u0026#34;, \u0026#34;org\u0026#34;: { \u0026#34;id\u0026#34;: \u0026#34;infinilabs\u0026#34; } } The API returns the following result:\n{\u0026#34;_id\u0026#34;:\u0026#34;0,0\u0026#34;, \u0026#34;_key\u0026#34;:\u0026#34;news_001\u0026#34;, \u0026#34;result\u0026#34;:\u0026#34;updated\u0026#34;} After the document modification, if you perform the fetch request:\nGET /my-collection/_doc/news_001 It returns an updated document like:\n{ \u0026#34;_id\u0026#34;: \u0026#34;0,0\u0026#34;, \u0026#34;_version\u0026#34;: 2, \u0026#34;_namespace\u0026#34;: \u0026#34;default\u0026#34;, \u0026#34;_collection\u0026#34;: \u0026#34;my-collection\u0026#34;, \u0026#34;_key\u0026#34; : \u0026#34;news_001\u0026#34;, \u0026#34;found\u0026#34;: true, \u0026#34;_source\u0026#34; : { \u0026#34;message\u0026#34;: \u0026#34;GET /search HTTP/1.1 200 1070000\u0026#34;, \u0026#34;org\u0026#34;: { \u0026#34;id\u0026#34;: \u0026#34;infinilabs\u0026#34; } } } Note that the document _version was increased to 2.\nPizza works by marking the 
old document as deleted and inserting a new document under the hood.\nRequest # POST /\u0026lt;target\u0026gt;/_doc/\u0026lt;doc_id\u0026gt; {\u0026lt;fields\u0026gt;} Path parameters # \u0026lt;target\u0026gt;\n(Required, string) Name of the collection to target. \u0026lt;doc_id\u0026gt;\n(Required, string) The unique identifier of this document; supports both _key and _id. Request body # \u0026lt;fields\u0026gt;\n(Required, string) Request body contains the JSON source for the document data. "});index.add({'id':46,'href':'/docs/references/document/partial_update/','title':"Partial update a document",'section':"Document",'content':"Partial update a document # Sometimes we may only need to update a portion of the fields of a document.\nExamples # Update the org.id field of the document news_001 in the collection my-collection:\nPUT /my-collection/_doc/news_001/_update { \u0026#34;sync\u0026#34;:{ \u0026#34;replace\u0026#34;:{ \u0026#34;org\u0026#34;: { \u0026#34;id\u0026#34;: \u0026#34;infinilabs\u0026#34; } } } } The API returns the following result:\n{\u0026#34;_id\u0026#34;:\u0026#34;0,0\u0026#34;, \u0026#34;_key\u0026#34;:\u0026#34;news_001\u0026#34;, \u0026#34;result\u0026#34;:\u0026#34;updated\u0026#34;} Pizza performs a partial update by fetching the document, merging the partial updates, and then replacing it.\nRequest # POST /\u0026lt;target\u0026gt;/_doc/\u0026lt;doc_id\u0026gt;/_update { \u0026#34;sync\u0026#34;:{ \u0026lt;operation\u0026gt;: {\u0026lt;fields\u0026gt;} }, \u0026#34;async\u0026#34;:{ \u0026lt;operation\u0026gt;: {\u0026lt;fields\u0026gt;} } } Pizza supports both sync and async ways to perform updates; to update in real time, use sync.\nIn asynchronous mode, the update process is considered complete once the request is committed to the WAL. 
Background tasks independently consume and process updates asynchronously, making it suitable for scenarios prioritizing update efficiency.\nPath parameters # \u0026lt;target\u0026gt;\n(Required, string) Name of the collection to target. \u0026lt;doc_id\u0026gt;\n(Required, string) The unique identifier of this document; supports both _key and _id. Request body # \u0026lt;operation\u0026gt; The operations supported by partial updates: add, replace, remove, array_append. \u0026lt;fields\u0026gt;\n(Required, string) The fields to apply the operation to, in JSON format. "});index.add({'id':47,'href':'/docs/references/document/delete/','title':"Delete a document",'section':"Document",'content':"Delete a document # Delete a specific document from the specified collection by specifying its unique identifier.\nExamples # Delete the document 0,0 from collection my-collection:\nDELETE /my-collection/_doc/0,0 The API returns the following result:\n{ \u0026#34;_id\u0026#34;: \u0026#34;0,0\u0026#34;, \u0026#34;result\u0026#34;: \u0026#34;deleted\u0026#34;, ... } Request # DELETE /\u0026lt;target\u0026gt;/_doc/\u0026lt;doc_id\u0026gt; Path parameters # \u0026lt;target\u0026gt;\n(Required, string) Name of the collection to target. \u0026lt;doc_id\u0026gt;\n(Required, string) Unique identifier for the document; supports both _key and _id. 
"});index.add({'id':48,'href':'/docs/references/document/bulk/','title':"Batch document operation",'section':"Document",'content':"Batch document operation # Provides an efficient way to perform multiple index, create, delete, and update operations in a single request.\nExamples # POST /_bulk { \u0026#34;index\u0026#34; : { \u0026#34;_index\u0026#34; : \u0026#34;test\u0026#34;, \u0026#34;_id\u0026#34; : \u0026#34;1\u0026#34; } } { \u0026#34;field1\u0026#34; : \u0026#34;value1\u0026#34; } { \u0026#34;delete\u0026#34; : { \u0026#34;_index\u0026#34; : \u0026#34;test\u0026#34;, \u0026#34;_id\u0026#34; : \u0026#34;2\u0026#34; } } { \u0026#34;create\u0026#34; : { \u0026#34;_index\u0026#34; : \u0026#34;test\u0026#34;, \u0026#34;_id\u0026#34; : \u0026#34;3\u0026#34; } } { \u0026#34;field1\u0026#34; : \u0026#34;value3\u0026#34; } { \u0026#34;update\u0026#34; : {\u0026#34;_id\u0026#34; : \u0026#34;1\u0026#34;, \u0026#34;_index\u0026#34; : \u0026#34;test\u0026#34;} } { \u0026#34;doc\u0026#34; : {\u0026#34;field2\u0026#34; : \u0026#34;value2\u0026#34;} } The API returns the following result:\n{ \u0026#34;took\u0026#34;: 30, \u0026#34;errors\u0026#34;: false, \u0026#34;items\u0026#34;: [ { \u0026#34;index\u0026#34;: { \u0026#34;_namespace\u0026#34;: \u0026#34;default\u0026#34;, \u0026#34;_collection\u0026#34;: \u0026#34;test\u0026#34;, \u0026#34;result\u0026#34;: \u0026#34;created\u0026#34;, ... } }, { \u0026#34;delete\u0026#34;: { \u0026#34;_namespace\u0026#34;: \u0026#34;default\u0026#34;, \u0026#34;_collection\u0026#34;: \u0026#34;test\u0026#34;, \u0026#34;result\u0026#34;: \u0026#34;not_found\u0026#34;, ... } }, { \u0026#34;create\u0026#34;: { \u0026#34;_namespace\u0026#34;: \u0026#34;default\u0026#34;, \u0026#34;_collection\u0026#34;: \u0026#34;test\u0026#34;, \u0026#34;result\u0026#34;: \u0026#34;created\u0026#34;, ... 
} }, { \u0026#34;update\u0026#34;: { \u0026#34;_namespace\u0026#34;: \u0026#34;default\u0026#34;, \u0026#34;_collection\u0026#34;: \u0026#34;test\u0026#34;, \u0026#34;result\u0026#34;: \u0026#34;updated\u0026#34;, ... } } ] } Request # POST /_bulk POST /\u0026lt;target\u0026gt;/_bulk Path parameters # \u0026lt;target\u0026gt;\n(Required, string) Name of the collection to target. Request body # The actions are specified in the request body using a newline delimited JSON (NDJSON) structure:\naction_and_meta_data\\n optional_source\\n action_and_meta_data\\n optional_source\\n .... action_and_meta_data\\n optional_source\\n The index and create actions expect a source on the next line, and have the same semantics as the standard API: create fails if a document with the same ID already exists in the target, index adds or replaces a document as necessary.\nupdate expects that the partial doc, upsert, and script and its options are specified on the next line.\ndelete does not expect a source on the next line and has the same semantics as the standard delete API.\nBecause this format uses literal \\n\u0026rsquo;s as delimiters, make sure that the JSON actions and sources are not pretty printed.\nIf you provide a \u0026lt;target\u0026gt; in the request path, it is used for any actions that don’t explicitly specify an _index argument.\ncreate # Indexes the specified document if it does not already exist. The following line must contain the source data to be indexed.\n _namespace\n(Optional, string) Name of the namespace to perform the action on. _collection\n(Optional, string) Name of the collection to perform the action on. This parameter is required if a \u0026lt;target\u0026gt; is not specified in the request path. _index\n(Optional, string) A shortcut to specify the namespace and collection in [\u0026lt;namespace\u0026gt;:]\u0026lt;collection\u0026gt; syntax. This parameter conflicts with \u0026lt;_namespace\u0026gt; and \u0026lt;_collection\u0026gt;. 
_id\n(Optional, string) The document ID. If no ID is specified, a document ID is automatically generated. delete # Removes the specified document from the index.\n _namespace\n(Optional, string) Name of the namespace to perform the action on. _collection\n(Optional, string) Name of the collection to perform the action on. This parameter is required if a \u0026lt;target\u0026gt; is not specified in the request path. _index\n(Optional, string) A shortcut to specify the namespace and collection in [\u0026lt;namespace\u0026gt;:]\u0026lt;collection\u0026gt; syntax. This parameter conflicts with \u0026lt;_namespace\u0026gt; and \u0026lt;_collection\u0026gt;. _id\n(Required, string) The document ID. index # Indexes the specified document. If the document exists, replaces the document and increments the version. The following line must contain the source data to be indexed.\n _namespace\n(Optional, string) Name of the namespace to perform the action on. _collection\n(Optional, string) Name of the collection to perform the action on. This parameter is required if a \u0026lt;target\u0026gt; is not specified in the request path. _index\n(Optional, string) A shortcut to specify the namespace and collection in [\u0026lt;namespace\u0026gt;:]\u0026lt;collection\u0026gt; syntax. This parameter conflicts with \u0026lt;_namespace\u0026gt; and \u0026lt;_collection\u0026gt;. _id\n(Optional, string) The document ID. If no ID is specified, a document ID is automatically generated. delete # Removes the specified document from the index.\n _namespace\n(Optional, string) Name of the namespace to perform the action on. _collection\n(Optional, string) Name of the collection to perform the action on. This parameter is required if a \u0026lt;target\u0026gt; is not specified in the request path. 
_index\n(Optional, string) A shortcut to specify the namespace and collection in [\u0026lt;namespace\u0026gt;:]\u0026lt;collection\u0026gt; syntax. This parameter conflicts with \u0026lt;_namespace\u0026gt; and \u0026lt;_collection\u0026gt;. _id\n(Required, string) The document ID. doc # The partial document to index. Required for update operations.\n\u0026lt;fields\u0026gt; # The document source to index. Required for create and index operations.\n"});index.add({'id':49,'href':'/docs/references/search/','title':"Search",'section':"References",'content':"Search # A search query, or query, is a request for information about documents in Pizza collections.\nA search consists of one or more queries that are combined and sent to Pizza. Documents that match a search\u0026rsquo;s queries are returned in the hits, or search results, of the response.\nA search may also contain additional information used to better process its queries. For example, a search may be limited to a specific collection or only return a specific number of results.\nExamples # Search all the collections under the default namespace whose names end with -logs, and fetch the documents whose field year has the value 2024:\nPOST /default.*-logs/_search { \u0026quot;query\u0026quot;: { \u0026quot;term\u0026quot;: { \u0026quot;year\u0026quot;: \u0026quot;2024\u0026quot; } } } Requests # POST /\u0026lt;targets\u0026gt;/_search Path parameters # targets\n(Optional, String) Comma-separated names of the collections to search (wildcards supported) Query parameters # from\n(Optional, integer) The number of documents to skip; must be non-negative and defaults to 0.\n size\n(Optional, integer) The maximum number of documents to be returned in hits, defaults to 20.\n Term-level queries # prefix query\nReturns documents that contain a specific prefix in a provided field. range query\nReturns documents that contain terms within a provided range. 
regexp query\nReturns documents that contain terms matching a regular expression. term query\nReturns documents that contain an exact term in a provided field. "});index.add({'id':50,'href':'/docs/references/aggregation/','title':"Aggregation",'section':"References",'content':"Aggregation # An aggregation summarizes your data as metrics, statistics, or other analytics.\nPizza organizes aggregations into the following categories:\n Metric aggregations that calculate metrics, such as a sum or average, from field values. Bucket aggregations that group documents into buckets, also called bins, based on field values, ranges, or other criteria. Metric aggregations # avg aggregation\nA single-value metrics aggregation that computes the average of numeric values that are extracted from the aggregated documents. max aggregation\nA single-value metrics aggregation that keeps track of and returns the maximum value among the numeric values extracted from the aggregated documents. min aggregation\nA single-value metrics aggregation that keeps track of and returns the minimum value among the numeric values extracted from the aggregated documents. percentiles aggregation\nA multi-value metrics aggregation that calculates one or more percentiles over numeric values extracted from the aggregated documents. sum aggregation\nA single-value metrics aggregation that sums up numeric values that are extracted from the aggregated documents. value_count aggregation\nA single-value metrics aggregation that counts the number of values that are extracted from the aggregated documents. Bucket aggregations # date_histogram aggregation\nA histogram aggregation that can only be used with date or date range values. terms aggregation\nA multi-bucket value source based aggregation where buckets are dynamically built - one per unique value. 
"});index.add({'id':51,'href':'/docs/community/','title':"Community",'section':"Documentation",'content':"Community hall of fame # The following acknowledges the Maintainers of the Pizza project, credits those who have Contributed to this repository (via bug reports, code, design, ideas, project management, translation, testing, etc.), proactive advocates of Pizza as Evangelists, and any other References utilized.\nMaintainers # Medcl, SteveLauC\nCommitters # Contributors # Loi Chyan, Hanming\nThanks to all the CONTRIBUTORS for their efforts to help Pizza get better.\nEvangelists # Adopters # The Pizza community of adopters is growing! Innovative organizations of all sizes and across industry sectors are committed to accelerating the adoption of commercial-grade, production-ready open source technologies developed by the Pizza community.\nDo you use INFINI Pizza? Show your support for open source by adding your logo to this page.\nPlease create an issue to add your logo below.\nReferences # "});})();