Nagiza Faridovna Samatova

Works (99)

Updated: April 5th, 2024 05:28

2021 article

Predictive models with end user preference

Zhao, Y., Yang, X., Bolnykh, C., Harenberg, S., Korchiev, N., Yerramsetty, S. R., … Samatova, N. F. (2021, August 26). STATISTICAL ANALYSIS AND DATA MINING, Vol. 8.

By: Y. Zhao n, X. Yang n, C. Bolnykh n, S. Harenberg*, N. Korchiev n, S. Yerramsetty*, B. Vellanki, R. Kodumagulla, N. Samatova n

author keywords: child support; decision tree; predictive model; regularization; relative ranking; user preference
TL;DR: A generic modeling method that respects end user preferences via a relative ranking system to express multi‐criteria preferences and a regularization term in the model's objective function to incorporate the ranked preferences is proposed. (via Semantic Scholar)
UN Sustainable Development Goal Categories
10. Reduced Inequalities (OpenAlex)
Sources: Web Of Science, NC State University Libraries
Added: September 7, 2021

2020 journal article

Group classification of the two-dimensional shallow water equations with the beta-plane approximation of coriolis parameter in Lagrangian coordinates

COMMUNICATIONS IN NONLINEAR SCIENCE AND NUMERICAL SIMULATION, 90.

By: S. Meleshko* & N. Samatova n

author keywords: Lagrangian coordinates; Shallow water equations; Uneven bottom; Admitted lie group
TL;DR: The paper provides a complete group classification of the equations and applications of Noether’s theorem for constructing conservation laws in two-dimensional shallow water equations. (via Semantic Scholar)
UN Sustainable Development Goal Categories
14. Life Below Water (Web of Science)
15. Life on Land (OpenAlex)
Source: Web Of Science
Added: September 21, 2020

2020 journal article

The one-dimensional Green-Naghdi equations with a time dependent bottom topography and their conservation laws

PHYSICS OF FLUIDS, 32(12).

By: E. Kaptsov, S. Meleshko n & N. Samatova n

UN Sustainable Development Goal Categories
14. Life Below Water (Web of Science)
15. Life on Land (OpenAlex)
Source: Web Of Science
Added: January 11, 2021

2018 article

An Intelligent and Hybrid Weighted Fuzzy Time Series Model Based on Empirical Mode Decomposition for Financial Markets Forecasting

ADVANCES IN DATA MINING: APPLICATIONS AND THEORETICAL ASPECTS (ICDM 2018), Vol. 10933, pp. 104–118.

By: R. Yang n, J. He*, M. Xu n, H. Ni n, P. Jones n & N. Samatova*

author keywords: EMD; Weighted fuzzy time series; Human learning optimization algorithm; Financial markets forecasting
TL;DR: A new Intelligent Hybrid Weighted Fuzzy (IHWF) time series model to improve forecasting accuracy in financial markets, which are complex nonlinear time-sensitive systems, influenced by many factors. (via Semantic Scholar)
Sources: Web Of Science, ORCID
Added: June 17, 2019

2018 article

Mining Aspect-Specific Opinions from Online Reviews Using a Latent Embedding Structured Topic Model

COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, CICLING 2017, PT II, Vol. 10762, pp. 195–210.

By: M. Xu n, R. Yang n, P. Jones n & N. Samatova n

TL;DR: This paper proposes a Latent embedding structured Opinion mining Topic model, called the LOT, which can simultaneously discover relevant aspect-level specific opinions from small or large numbers of reviews and assign accurate sentiment to words. (via Semantic Scholar)
UN Sustainable Development Goal Categories
1. No Poverty (OpenAlex)
Sources: Web Of Science, ORCID
Added: January 28, 2019

2018 journal article

Sex Differences in Cognitive Decline in Subjects with High Likelihood of Mild Cognitive Impairment due to Alzheimer's disease

SCIENTIFIC REPORTS, 8.

By: D. Sohn n, K. Shpanskaya*, J. Lucas*, J. Petrella*, A. Saykin*, R. Tanzi*, N. Samatova*, P. Doraiswamy*

MeSH headings: Aged; Aged, 80 and over; Alzheimer Disease / diagnosis; Alzheimer Disease / epidemiology; Alzheimer Disease / etiology; Alzheimer Disease / pathology; Cognition / physiology; Cognitive Dysfunction / diagnosis; Cognitive Dysfunction / epidemiology; Cognitive Dysfunction / etiology; Disease Progression; Female; Humans; Longitudinal Studies; Male; Neuroimaging; Neuropsychological Tests; Prevalence; Retrospective Studies; Risk Factors; Sex Characteristics
TL;DR: The findings highlight the need to further investigate these findings in other populations and develop sex specific timelines for Alzheimer’s disease progression and further investigate the effect of sex on cognitive progression in subjects with high likelihood of mild cognitive impairment. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2017 chapter

A Community-Driven Graph Partitioning Method for Constraint-Based Causal Discovery

In C. Cherifi, H. Cherifi, M. Karsai, & M. Musolesi (Eds.), Complex Networks & Their Applications VI. COMPLEX NETWORKS 2017 (pp. 253–264).

By: M. Chaudhary n, S. Ranshous n & N. Samatova n

TL;DR: This work proposes a novel recursive algorithm for constructing causal graphs, based on a two-phase divide and conquer strategy, which effectively finds the d-separators, leading to a significant improvement in the quality of causal graphs. (via Semantic Scholar)
Source: Crossref
Added: December 11, 2020

2017 article

A Lifelong Learning Topic Model Structured Using Latent Embeddings

2017 11TH IEEE INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC), pp. 260–261.

By: M. Xu n, R. Yang n, S. Harenberg n & N. Samatova n

author keywords: Lifelong learning; Topic modeling; Latent embeddings
TL;DR: A latent-embedding-structured lifelong learning topic model, called the LLT model, to discover coherent topics from a corpus and exploit latent word embeddings to structure the model and mine word correlation knowledge to assist in topic modeling. (via Semantic Scholar)
Sources: Web Of Science, ORCID
Added: August 6, 2018

2017 article

An Intelligent Weighted Fuzzy Time Series Model Based on a Sine-Cosine Adaptive Human Learning Optimization Algorithm and Its Application to Financial Markets Forecasting

ADVANCED DATA MINING AND APPLICATIONS, ADMA 2017, Vol. 10604, pp. 595–607.

By: R. Yang n, M. Xu n, J. He*, S. Ranshous n & N. Samatova*

author keywords: Weighted fuzzy time series; Human learning optimization algorithm; Financial markets forecasting
TL;DR: An intelligent weighted fuzzy time series model for financial forecasting, which uses a sine-cosine adaptive human learning optimization (SCHLO) algorithm to search for the optimal parameters for forecasting, is presented. (via Semantic Scholar)
Sources: Web Of Science, ORCID
Added: November 26, 2018

2017 chapter

Efficient Outlier Detection in Hyperedge Streams Using MinHash and Locality-Sensitive Hashing

In C. Cherifi, H. Cherifi, M. Karsai, & M. Musolesi (Eds.), Complex Networks & Their Applications VI. COMPLEX NETWORKS 2017 (pp. 105–116).

By: S. Ranshous n, M. Chaudhary n & N. Samatova n

TL;DR: This work proposes the first approach for mining outliers in hyperedge streams, and describes an approximation scheme that ensures the model is suitable for being run in streaming environments. (via Semantic Scholar)
Source: Crossref
Added: December 11, 2020

2017 chapter

Exchange Pattern Mining in the Bitcoin Transaction Directed Hypergraph

In Financial Cryptography and Data Security (pp. 248–263).

By: S. Ranshous n, C. Joslyn*, S. Kreyling*, K. Nowak*, N. Samatova*, C. West*, S. Winters*

TL;DR: This work identifies distinct statistical properties of exchange addresses related to the acquisition and spending of bitcoin, and builds classification models to learn a set of discriminating features and predicts if an address is owned by an exchange with >80% accuracy using purely structural features of the graph. (via Semantic Scholar)
Source: Crossref
Added: February 24, 2020

2017 conference paper

Leveraging External Knowledge for Phrase-Based Topic Modeling

Proceedings - 2017 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2017, 29–32.

By: M. Xu n, R. Yang n, S. Ranshous n, S. Li n & N. Samatova n

TL;DR: Experimental results show that the proposed knowledge-based topic model outperforms the state-of-the-art baseline on both small and large datasets, extracting more meaningful phrases and coherent topics. (via Semantic Scholar)
Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2017 conference paper

Real time utility-based recommendation for revenue optimization via an adaptive online Top-K high utility itemsets mining model

ICNC-FSKD 2017 - 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery, 1859–1866.

By: R. Yang n, M. Xu n, P. Jones n & N. Samatova*

TL;DR: This work considers that online transaction streams are usually accompanied with flow fluctuation, and proposes an Adaptive Online Top-K (RAOTK) high utility itemsets mining model to guide the utility-based recommendations. (via Semantic Scholar)
Sources: NC State University Libraries, ORCID
Added: August 6, 2018

2017 journal article

Theory-Guided Data Science: A New Paradigm for Scientific Discovery from Data

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 29(10), 2318–2331.

author keywords: Data science; knowledge discovery; domain knowledge; scientific theory; physical consistency; interpretability
TL;DR: The paradigm of theory-guided data science is formally conceptualized and a taxonomy of research themes in TGDS is presented and several approaches for integrating domain knowledge in different research themes are described using illustrative examples from different disciplines. (via Semantic Scholar)
UN Sustainable Development Goal Categories
13. Climate Action (OpenAlex)
Source: Web Of Science
Added: August 6, 2018

2017 article

Towards Automatic Linkage of Knowledge Worker's Claims with Associated Evidence from Screenshots

2017 THIRD IEEE INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING SERVICE AND APPLICATIONS (IEEE BIGDATASERVICE 2017), pp. 17–22.

By: P. Jones n, D. Medd n, S. Ramakrishnan n, R. Shah n, J. Keyton n & N. Samatova n

TL;DR: An alternative approach to instrumentation, based on automated analysis of desktop screenshots, is proposed in the context of extracting 'claims' from reports that users are writing and associating these claims with 'evidence' obtained from web browsing. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2016 article

AMRZone: A Runtime AMR Data Sharing Framework For Scientific Applications

2016 16TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID), pp. 116–125.

By: W. Zhang n, H. Tang n, S. Harenberg n, S. Byna*, X. Zou n, D. Devendran*, D. Martin*, K. Wu* ...

TL;DR: AMRZone's performance and scalability are comparable with existing state-of-the-art work when tested over uniform mesh data with up to 16384 cores; in the best case, the framework achieves a 46% performance improvement. (via Semantic Scholar)
UN Sustainable Development Goal Categories
4. Quality Education (OpenAlex)
Source: Web Of Science
Added: August 6, 2018

2016 chapter

Causality-Guided Feature Selection

In Advanced Data Mining and Applications (pp. 391–405).

By: M. Chaudhary n, D. Gonzalez n, G. Bello n, M. Angus n, D. Desai n, S. Harenberg n, P. Doraiswamy*, F. Semazzi n, V. Kumar*, N. Samatova*

TL;DR: This work proposes a causality-guided feature selection methodology that identifies factors having a potential cause-effect relationship in complex systems, and selects features by clustering them based on their causal strength with respect to the response, and validate the proposed methodology for predicting response in five real-world datasets. (via Semantic Scholar)
UN Sustainable Development Goal Categories
13. Climate Action (OpenAlex)
Source: Crossref
Added: February 24, 2020

2016 chapter

Community Detection in Dynamic Attributed Graphs

In Advanced Data Mining and Applications (pp. 329–344).

By: G. Bello n, S. Harenberg n, A. Agrawal n & N. Samatova*

TL;DR: The results obtained show that the proposed algorithm is able to identify graph partitions of high modularity and high attribute similarity more efficiently than state-of-the-art methods for community detection. (via Semantic Scholar)
Source: Crossref
Added: February 24, 2020

2016 conference paper

Exploring memory hierarchy and network topology for runtime AMR data sharing across scientific applications

2016 IEEE International Conference on Big Data (Big Data), 1359–1366.

TL;DR: Results show the proposed framework's spatial access pattern detection and prefetching methods demonstrate about 26% performance improvement for client analytical processes and the framework's topology-aware data placement can improve overall data access performance by up to 18%. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2016 article

In situ Storage Layout Optimization for AMR Spatio-temporal Read Accesses

PROCEEDINGS 45TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING - ICPP 2016, pp. 406–415.

By: H. Tang n, S. Byna*, S. Harenberg n, W. Zhang n, X. Zou n, D. Martin*, B. Dong*, D. Devendran* ...

TL;DR: This work develops an in situ data layout optimization framework that automatically selects from a set of candidate layouts based on a performance model, and reorganizes the data before writing to storage to enable efficient AMR read accesses. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2016 chapter

Knowledge-Guided Maximal Clique Enumeration

In Advanced Data Mining and Applications (pp. 604–618).

By: S. Harenberg n, R. Seay n, G. Bello n, R. Chirkova n, P. Doraiswamy* & N. Samatova n

TL;DR: The problem of knowledge-biased clique enumeration is introduced, a query-driven formulation that reduces output space, computation time, and memory usage, and a dynamic state space indexing strategy for efficiently processing multiple queries over the same graph is introduced. (via Semantic Scholar)
Source: Crossref
Added: February 24, 2020

2016 article

Usage Pattern-Driven Dynamic Data Layout Reorganization

2016 16TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID), pp. 356–365.

By: H. Tang n, S. Byna*, S. Harenberg n, X. Zou n, W. Zhang n, K. Wu*, B. Dong*, O. Rubel* ...

TL;DR: This work proposes a framework that dynamically recognizes the data usage patterns, replicates the data of interest in multiple reorganized layouts that would benefit common read patterns, and makes runtime decisions on selecting a favorable layout for a given read pattern. (via Semantic Scholar)
UN Sustainable Development Goal Categories
4. Quality Education (OpenAlex)
Source: Web Of Science
Added: August 6, 2018

2015 review

Anomaly detection in dynamic networks: a survey

WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 7(3), 223–247.

author keywords: anomaly detection; dynamic networks; outlier detection; graph mining; dynamic network anomaly detection; network anomaly detection
TL;DR: This work focuses on anomaly detection in static graphs, which do not change and are capable of representing only a single snapshot of data, but as real‐world networks are constantly changing, there has been a shift in focus to dynamic graphs, which evolve over time. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2015 article

Exploring Memory Hierarchy to Improve Scientific Data Read Performance

2015 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING - CLUSTER 2015, pp. 66–69.

By: W. Zhang n, H. Tang n, X. Zou n, S. Harenberg n, Q. Liu n, S. Klasky n, N. Samatova n

author keywords: scientific data; read contention; memory hierarchy; SSD
TL;DR: This paper proposes a framework that exploits the memory hierarchy resource to address the read contention issues involved with SSDs and achieves up to 50% read performance improvement when tested on datasets from real-world scientific simulations. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2015 journal article

On size-constrained minimum s-t cut problems and size-constrained dense subgraph problems

THEORETICAL COMPUTER SCIENCE, 609, 434–442.

By: W. Chen*, N. Samatova n, M. Stallmann n, W. Hendrix n & W. Ying*

author keywords: At-least-k-subgraph problem; At-most-k-subgraph problem; Approximation algorithm; The minimum s-t cut with at-least-k vertices problem; The minimum s-t cut with at-most-k vertices problem; The minimum s-t cut with exactly k vertices problem
TL;DR: The minimum s-t cut with at-least-k vertices problem, the minimum s-t cut with at-most-k vertices problem, and the minimum s-t cut with exactly k vertices problem are introduced and it is proved that they are NP-complete. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2015 journal article

On the data-driven inference of modulatory networks in climate science: an application to West African rainfall

NONLINEAR PROCESSES IN GEOPHYSICS, 22(1), 33–46.

By: D. Gonzalez n, M. Angus n, I. Tetteh n, G. Bello n, K. Padmanabhan n, S. Pendse n, S. Srinivas n, J. Yu n ...

Source: Web Of Science
Added: August 6, 2018

2015 article

Parallel In Situ Detection of Connected Components in Adaptive Mesh Refinement Data

2015 15TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING, pp. 302–312.

By: X. Zou n, K. Wu*, D. Boyuka n, D. Martin*, S. Byna*, H. Tang n, K. Bansal n, T. Ligocki*, H. Johansen*, N. Samatova n

TL;DR: The first connected component detection methodology for structured AMR that is applicable in a parallel, in situ context is presented, incorporating a multi-phase AMR-aware communication pattern that synchronizes connectivity information across the AMR hierarchy. (via Semantic Scholar)
UN Sustainable Development Goal Categories
9. Industry, Innovation and Infrastructure (OpenAlex)
Source: Web Of Science
Added: August 6, 2018

2015 chapter

Response-Guided Community Detection: Application to Climate Index Discovery

In Machine Learning and Knowledge Discovery in Databases (pp. 736–751).

By: G. Bello n, M. Angus n, N. Pedemane n, J. Harlalka n, F. Semazzi n, V. Kumar*, N. Samatova*

author keywords: Community detection; Spatiotemporal data; Climate index discovery; Seasonal rainfall prediction
TL;DR: This work proposes a general strategy for response-guided community detection that explicitly incorporates information of the response variable during the community detection process, and introduces a graph representation of spatiotemporal data that leverages information from multiple variables. (via Semantic Scholar)
UN Sustainable Development Goal Categories
13. Climate Action (OpenAlex)
Source: Crossref
Added: February 24, 2020

2015 article

The Hyperdyadic Index and Generalized Indexing and Query with PIQUE

PROCEEDINGS OF THE 27TH INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT.

By: D. Boyuka n, H. Tang n, K. Bansal n, X. Zou n, S. Klasky* & N. Samatova n

TL;DR: PIQUE factors out commonalities in indexing, employing algorithmic/data structure "plugins" to mix orthogonal indexing concepts such as FastBit compressed bitmaps with ALACRITY binning, all within one framework. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2014 journal article

An efficient algorithm for pairwise local alignment of protein interaction networks

JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 13(2).

author keywords: Network alignment; conserved functional modules; graph optimization; graph theory
MeSH headings: Algorithms; Animals; Computational Biology; Conserved Sequence; Gene Ontology / statistics & numerical data; Humans; Protein Interaction Mapping / statistics & numerical data; Protein Interaction Maps; Sequence Alignment / statistics & numerical data
TL;DR: The problem of identifying conserved patterns of protein interaction networks as a graph optimization problem is formulated, and a fast heuristic algorithm for this problem is developed that discovers conserved modules with a larger number of proteins in an order of magnitude less time. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2014 journal article

Community detection in large-scale networks: a survey and empirical evaluation

Wiley Interdisciplinary Reviews: Computational Statistics, 6(6), 426–439.

By: S. Harenberg n, G. Bello n, L. Gjeltema n, S. Ranshous n, J. Harlalka n, R. Seay n, K. Padmanabhan n, N. Samatova n

author keywords: clustering; community detection; empirical evaluation; graphs; ground-truth; networks
TL;DR: This review evaluated eight state‐of‐the‐art and five traditional algorithms for overlapping and disjoint community detection on large‐scale real‐world networks with known ground‐truth communities and showed that these two types of metrics are not equivalent. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2014 journal article

DIRAQ: scalable in situ data- and resource-aware indexing for optimized query performance

CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 17(4), 1101–1119.

By: S. Lakshminarasimhan n, X. Zou n, D. Boyuka n, S. Pendse n, J. Jenkins n, V. Vishwanath*, M. Papka*, S. Klasky*, N. Samatova n

author keywords: Exascale computing; Indexing; Query processing; Compression
TL;DR: DIRAQ is proposed, a parallel in situ, in network data encoding and reorganization technique that enables the transformation of simulation output into a query-efficient form, with negligible runtime overhead to the simulation run. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2014 journal article

Different Modes of Variability over the Tasman Sea: Implications for Regional Climate

JOURNAL OF CLIMATE, 27(22), 8466–8486.

By: S. Liess*, A. Kumar*, P. Snyder*, J. Kawale*, K. Steinhaeuser*, F. Semazzi n, A. Ganguly*, N. Samatova n, V. Kumar*

author keywords: Australia; Southern Ocean; Annular mode; ENSO; Teleconnections; Drought
UN Sustainable Development Goal Categories
13. Climate Action (Web of Science)
14. Life Below Water (OpenAlex)
Source: Web Of Science
Added: August 6, 2018

2014 chapter

Fast Set Intersection through Run-Time Bitmap Construction over PForDelta-Compressed Indexes

In Lecture Notes in Computer Science (pp. 668–679).

By: X. Zou n, S. Lakshminarasimhan*, D. Boyuka n, S. Ranshous n, H. Tang n, S. Klasky*, N. Samatova n

TL;DR: This work has shown that the recently-presented PForDelta-compressed index has been demonstrated to be storage-lightweight, but has limited performance for set intersection. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2014 chapter

Improving Read Performance with Online Access Pattern Analysis and Prefetching

In Lecture Notes in Computer Science (pp. 246–257).

By: H. Tang n, X. Zou n, J. Jenkins n, D. Boyuka n, S. Ranshous n, D. Kimpe*, S. Klasky*, N. Samatova*

TL;DR: This work proposes an online analyzer capable of detecting both simple and complex access patterns with low computational and memory overhead and high accuracy, and consistently observes run-time reductions across 18 configurations of PIOBench and 4 configurations of a micro-benchmark with both structured and unstructured access patterns. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2014 chapter

RADAR: Runtime Asymmetric Data-Access Driven Scientific Data Replication

In Lecture Notes in Computer Science (pp. 296–313).

By: J. Jenkins n, X. Zou n, H. Tang n, D. Kimpe*, R. Ross* & N. Samatova*

TL;DR: This work presents a partial data replication system called RADAR, which stores all replica data and metadata, along with the original untouched data, under a single file container using the object abstraction in parallel filesystems. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2014 journal article

Solving the maximum duo-preservation string mapping problem with linear programming

THEORETICAL COMPUTER SCIENCE, 530, 1–11.

By: W. Chen*, Z. Chen n, N. Samatova n, L. Peng*, J. Wang* & M. Tang*

author keywords: Approximation algorithm; Constrained maximum induced subgraph problem; Duo-preservation string mapping; Linear programming; Integer programming; Randomized rounding
TL;DR: The maximum duo-preservation string mapping problem (MPSM), which is complementary to the minimum common string partition problem (MCSP), is introduced and it is shown that both CMIS and CNIS are NP-complete. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2014 journal article

Theory-Guided Data Science for Climate Change

COMPUTER, 47(11), 74–78.

author keywords: data analysis; discovery analytics; data mining; big data; scientific computing; theory-guided data science; climate change
TL;DR: To adequately address climate change, the authors need novel data-science methods that account for the spatiotemporal and physical nature of climate phenomena to move from statistical analysis to scientific insights. (via Semantic Scholar)
UN Sustainable Development Goal Categories
13. Climate Action (Web of Science; OpenAlex)
Source: Web Of Science
Added: August 6, 2018

2014 article

Transparent In Situ Data Transformations in ADIOS

2014 14TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID), pp. 256–266.

By: D. Boyuka n, S. Lakshminarasimhan*, X. Zou n, Z. Gong n, J. Jenkins n, E. Schendel n, N. Podhorszki*, Q. Liu*, S. Klasky*, N. Samatova n

TL;DR: This work develops an in situ data transformation framework in the ADIOS I/O middleware with a "plug in" interface, thus greatly simplifying both the deployment and use of data transform services in scientific applications. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2013 conference paper

A generic high-performance method for deinterleaving scientific data

Euro-par 2013 parallel processing, 8097, 571–582.

TL;DR: To the best of the authors' knowledge, this is the first deinterleaving method that exploits data cache prefetching, reduces memory accesses, and optimizes the use of complete cache line writes. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Source: NC State University Libraries
Added: August 6, 2018

2013 journal article

A graph-based approach to find teleconnections in climate data

Statistical Analysis and Data Mining, 6(3), 158–179.

author keywords: graph algorithm; teleconnections; dipole discovery
TL;DR: A systematic graph‐based approach to find the teleconnections in climate data is presented, which can generate a single snapshot picture of all the dipole interconnections on the globe in a given dataset and thus makes it possible to study the changes in dipole interactions and movements. (via Semantic Scholar)
UN Sustainable Development Goal Categories
13. Climate Action (Web of Science; OpenAlex)
Source: Crossref
Added: August 28, 2020

2013 chapter

ALACRITY: Analytics-Driven Lossless Data Compression for Rapid In-Situ Indexing, Storing, and Querying

In Transactions on Large-Scale Data- and Knowledge-Centered Systems X (pp. 95–114).

By: J. Jenkins n, I. Arkatkar n, S. Lakshminarasimhan n, D. Boyuka n, E. Schendel n, N. Shah n, S. Ethier*, C. Chang* ...

Source: Crossref
Added: August 28, 2020

2013 journal article

Complex biomarker discovery in neuroimaging data: Finding a needle in a haystack

NeuroImage: Clinical, 3, 123–131.

By: G. Atluri*, K. Padmanabhan n, G. Fang*, M. Steinbach*, J. Petrella*, K. Lim*, A. MacDonald*, N. Samatova n, P. Doraiswamy*, V. Kumar*

TL;DR: The nature of complex biomarkers being investigated in the recent literature is considered and techniques to find such biomarkers that have been developed in related areas of data mining, statistics, machine learning and bioinformatics are presented. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2013 article

Coupled Heterogeneous Association Rule Mining (CHARM): Application toward Inference of Modulatory Climate Relationships

2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), pp. 1055–1060.

By: D. Gonzalez n, S. Pendse n, K. Padmanabhan n, M. Angus n, I. Tetteh n, S. Srinivas n, A. Villanes n, F. Semazzi n, V. Kumar*, N. Samatova n

author keywords: association rules; climate; data coupling; discovery
TL;DR: Coupled Heterogeneous Association Rule Mining (CHARM), a computationally efficient methodology that mines higher-order relationships between these subsystems' anomalous temporal phases with respect to their effect on the system's response, is presented. (via Semantic Scholar)
UN Sustainable Development Goal Categories
13. Climate Action (Web of Science; OpenAlex)
Source: Web Of Science
Added: August 6, 2018

2013 journal article

Global Alignment of Pairwise Protein Interaction Networks for Maximal Common Conserved Patterns

INTERNATIONAL JOURNAL OF GENOMICS, 2013.

By: W. Tian* & N. Samatova n

TL;DR: This work introduces a connected-components based fast algorithm, HopeMap, for network alignment, which is fast with linear computational cost, highly accurate in terms of KO and GO terms specificity and sensitivity, and can be extended to multiple alignments easily. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2013 journal article

Hello ADIOS: the challenges and lessons of developing leadership class I/O frameworks

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 26(7), 1453–1473.

By: Q. Liu*, J. Logan*, Y. Tian*, H. Abbasi*, N. Podhorszki*, J. Choi*, S. Klasky*, R. Tchoua* ...

author keywords: high performance computing; high performance I/O; I/O middleware
TL;DR: The startling observations made in the last half decade of I/O research and development are described, and some of the challenges that remain as the coming Exascale era are detailed. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Source: Web Of Science
Added: August 6, 2018

2013 article

PARLO: PArallel Run-time Layout Optimization for Scientific Data Explorations with Heterogeneous Access Patterns

PROCEEDINGS OF THE 2013 13TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID 2013), pp. 343–351.

By: Z. Gong n, D. Boyuka n, X. Zou n, Q. Liu*, N. Podhorszki*, S. Klasky*, X. Ma n, N. Samatova n

Source: Web Of Science
Added: August 6, 2018

2013 journal article

Processing MPI Derived Datatypes on Noncontiguous GPU-Resident Data

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 25(10), 2627–2637.

By: J. Jenkins n, J. Dinan*, P. Balaji*, T. Peterka*, N. Samatova n & R. Thakur*

author keywords: MPI; graphics processing unit; CUDA; datatype
TL;DR: This work utilizes a kernel on the GPU to pack arbitrary noncontiguous GPU data by enriching the datatypes encoding to expose a fine-grained, data-point level of parallelism, and demonstrates the efficacy of kernel-based packing in various communication scenarios. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2012 chapter

Analytics-Driven Lossless Data Compression for Rapid In-situ Indexing, Storing, and Querying

In Lecture Notes in Computer Science (pp. 16–30).

By: J. Jenkins n, I. Arkatkar n, S. Lakshminarasimhan n, N. Shah n, E. Schendel n, S. Ethier*, C. Chang*, J. Chen* ...

TL;DR: This paper proposes a co-designed double-precision compression and indexing methodology for range queries by performing unique-value-based binning on the most significant bytes of double precision data, and inverting the resulting metadata to produce an inverted index over a reduced data representation. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2012 conference paper

Byte-precision level of detail processing for variable precision analytics

International conference for high performance computing networking.

By: J. Jenkins n, E. Schendel n, S. Lakshminarasimhan n, D. Boyuka n, T. Rogers n, S. Ethier*, R. Ross*, S. Klasky*, N. Samatova n

TL;DR: A precision level of detail (APLOD) library is developed, which partitions double-precision datasets along user-defined byte boundaries, and finds a strong applicability for the use of varying degrees of precision to reduce the cost of analyzing extreme-scale data. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2012 journal article

Discovery of extreme events-related communities in contrasting groups of physical system networks

DATA MINING AND KNOWLEDGE DISCOVERY, 27(2), 225–258.

By: Z. Chen n, W. Hendrix n, H. Guan*, I. Tetteh n, A. Choudhary*, F. Semazzi n, N. Samatova n

author keywords: Spatio-temporal data mining; Complex network analysis; Community detection; Comparative analysis; Network motif detection; Extreme event prediction
TL;DR: This paper formulates a novel problem—detection of predictive and phase-biased communities in contrasting groups of networks, and proposes an efficient and effective machine learning solution for finding such anomalous communities. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2012 journal article

Functional Annotation of Hierarchical Modularity

PLOS ONE, 7(4).

By: K. Padmanabhan n, K. Wang n & N. Samatova n

MeSH headings : Algorithms; Cluster Analysis; Computational Biology / methods; Databases, Factual; Metabolic Networks and Pathways; Protein Interaction Mapping; Saccharomyces cerevisiae / genetics; Saccharomyces cerevisiae / metabolism; Saccharomyces cerevisiae Proteins / genetics; Saccharomyces cerevisiae Proteins / metabolism
TL;DR: The complementary method provides the hierarchical functional annotation of the modules and their hierarchically organized components by directly incorporating the functional taxonomy information into the hierarchy search process and by allowing multi-functional genes to be part of more than one component in the hierarchy. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2012 article

Hierarchical Classifier-Regression Ensemble for Multi-Phase Non-Linear Dynamic System Response Prediction: Application to Climate Analysis

12TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2012), pp. 781–788.

TL;DR: This paper proposes a hybrid approach that first predicts the phase the system is in, and then estimates the magnitude of the system's response using the regression model optimized for this phase, designed for systems that could be characterized by multi-variate spatio-temporal data from observations, simulations, or both. (via Semantic Scholar)
UN Sustainable Development Goal Categories
13. Climate Action (Web of Science)
Source: Web Of Science
Added: August 6, 2018

2012 journal article

ISABELA for effective in situ compression of scientific data

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 25(4), 524–540.

author keywords: lossy compression; B-spline; in situ processing; data-intensive application; high performance computing
TL;DR: The random nature of real‐valued scientific datasets renders lossless compression routines ineffective, and these techniques also impose significant overhead during decompression, making them unsuitable for data analysis and visualization, which require repeated data access. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2012 article

ISOBAR Preconditioner for Effective and High-throughput Lossless Data Compression

2012 IEEE 28TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), pp. 138–149.

By: E. Schendel n, Y. Jin n, N. Shah n, J. Chen*, C. Chang*, S. Ku*, S. Ethier*, S. Klasky* ...

TL;DR: The In-Situ Orthogonal Byte Aggregate Reduction Compression (ISOBAR-compression) methodology is introduced as a preconditioner of lossless compression to identify and optimize the compression efficiency and throughput of hard-to-compress datasets. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2012 journal article

In-silico identification of phenotype-biased functional modules

PROTEOME SCIENCE, 10.

By: K. Padmanabhan n, K. Wilson*, A. Rocha*, K. Wang n, J. Mihelcic* & N. Samatova*

TL;DR: A methodology to identify phenotype-biased cellular subsystems that are more prone to occur in phenotype-expressing organisms than in phenotype non-expressing organisms is proposed, and its effectiveness is shown by applying it to several target phenotypes. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2012 article

Multi-level Layout Optimization for Efficient Spatio-temporal Queries on ISABELA-compressed Data

2012 IEEE 26TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), pp. 873–884.

By: Z. Gong n, S. Lakshminarasimhan n, J. Jenkins n, H. Kolla*, S. Ethier*, J. Chen*, R. Ross*, S. Klasky*, N. Samatova n

TL;DR: This work presents a parallel query-processing engine that can handle both range queries and queries with spatio-temporal constraints, on B-spline compressed data with user-controlled accuracy, and shows it to be efficient with respect to storage, computation, and I/O compared to existing database technologies optimized for query processing on scientific data. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2012 journal article

NIBBS-Search for Fast and Accurate Prediction of Phenotype-Biased Metabolic Systems

PLOS COMPUTATIONAL BIOLOGY, 8(5).

MeSH headings : Algorithms; Animals; Computer Simulation; Data Mining / methods; Databases, Protein; Humans; Metabolome / physiology; Models, Biological; Periodicals as Topic; Phenotype; Protein Interaction Mapping / methods; Proteome / metabolism; Signal Transduction / physiology
TL;DR: Network Instance-Based Biased Subgraph Search (NIBBS) is a graph-theoretic method for genome-scale metabolic network comparative analysis that can identify metabolic systems that are statistically biased toward phenotype-expressing organismal networks. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2012 conference paper

On the path to sustainable, scalable, and energy-efficient data analytics: Challenges, promises, and future directions

2012 International Green Computing Conference (IGCC).

TL;DR: This paper proposes a number of future directions that could be pursued on the path to sustainable data analytics at scale, including transformative approaches to efficient data reduction, analytics-driven query processing, scalable analytical kernels, approximate analytics, among others. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2011 journal article

Community-based anomaly detection in evolutionary networks

Journal of Intelligent Information Systems, 39(1), 59–85.

By: Z. Chen n, W. Hendrix n & N. Samatova*

TL;DR: This work develops a parameter-free and scalable algorithm using a proposed representative-based technique to detect all six possible types of community-based anomalies: grown, shrunken, merged, split, born, and vanished communities, and detail the underlying theory required to guarantee the correctness of the algorithm. (via Semantic Scholar)
UN Sustainable Development Goal Categories
13. Climate Action (OpenAlex)
Source: NC State University Libraries
Added: August 6, 2018

2011 chapter

Compressing the Incompressible with ISABELA: In-situ Reduction of Spatio-temporal Data

In Euro-Par 2011 Parallel Processing (pp. 366–379).

By: S. Lakshminarasimhan n, N. Shah n, S. Ethier*, S. Klasky*, R. Latham*, R. Ross*, N. Samatova*

TL;DR: This work proposes an effective method for In-situ Sort-And-B-spline Error-bounded Lossy Abatement (ISABELA) of scientific data that is widely regarded as effectively incompressible and significantly outperforms existing lossy compression methods, such as Wavelet compression. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2011 journal article

DENSE: efficient and prior knowledge-driven discovery of phenotype-associated protein functional modules

BMC SYSTEMS BIOLOGY, 5.

MeSH headings : Animals; Binding Sites; Cattle; Cell Movement; Cells, Cultured; Computer Simulation; Extracellular Matrix / metabolism; Fibronectins / metabolism; Humans; Models, Biological; Neuropilin-1 / metabolism; Pancreatic Elastase / metabolism; Phenotype; Rats; Receptors, Vascular Endothelial Growth Factor / metabolism; Signal Transduction; Systems Biology; Vascular Endothelial Growth Factor A / chemistry; Vascular Endothelial Growth Factor A / metabolism; Vascular Endothelial Growth Factor A / physiology
TL;DR: A fast and theoretically guaranteed method called DENSE (Dense and ENriched Subgraph Enumeration) that can take in as input a biologist's prior knowledge as a set of query proteins and identify all the dense functional modules in a biological network that contain some part of the query vertices is introduced. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2011 journal article

Efficient alpha, beta-motif finder for identification of phenotype-related functional modules

BMC BIOINFORMATICS, 12.

MeSH headings : Acids / metabolism; Algorithms; Bacteria / genetics; Bacteria / metabolism; Citric Acid Cycle; Computing Methodologies; Hydrogen / metabolism; Phenotype; Proteobacteria
TL;DR: A methodology that can identify potential statistically significant phenotype-related functional modules that are in at least α networks of phenotype-expressing organisms but appear in no more than β networks of organisms that do not exhibit the target phenotype is proposed. (via Semantic Scholar)
UN Sustainable Development Goal Categories
6. Clean Water and Sanitation (OpenAlex)
Source: Web Of Science
Added: August 6, 2018

2011 journal article

Group analysis of the thin film dewetting equation

INTERNATIONAL JOURNAL OF NON-LINEAR MECHANICS, 47(1), 9–13.

By: S. Meleshko*, N. Samatova n & A. Melechko n

author keywords: Thin film dewetting; Admitted Lie group; Invariant solution
Source: Web Of Science
Added: August 6, 2018

2011 chapter

Lessons Learned from Exploring the Backtracking Paradigm on the GPU

In Euro-Par 2011 Parallel Processing (pp. 425–437).

By: J. Jenkins n, I. Arkatkar n, J. Owens*, A. Choudhary* & N. Samatova*

TL;DR: The backtracking paradigm with properties seen as sub-optimal for GPU architectures is explored, using as a case study the maximal clique enumeration problem, and it is found that the presence of these properties limits GPU performance to approximately 1.4-2.25 times a single CPU core. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2011 journal article

Quantitative proteomic analyses of the response of acidophilic microbial communities to different pH conditions

The ISME Journal, 5(7), 1152–1161.

By: C. Belnap*, C. Pan*, V. Denef*, N. Samatova*, R. Hettich* & J. Banfield*

author keywords: acid mine drainage; communities; genotyping; perturbation; proteomics
MeSH headings : Acids / analysis; Bacteria / genetics; Bacteria / growth & development; Bacteria / metabolism; Bacterial Proteins / analysis; Biofilms; Genotype; Hydrogen-Ion Concentration; Proteome / analysis; Proteome / metabolism; Proteomics / methods
TL;DR: The results confirm the importance of pH and related geochemical factors in fine-tuning acidophilic microbial community structure and function at the species and strain level, and demonstrate the broad utility of proteomics in laboratory community studies. (via Semantic Scholar)
UN Sustainable Development Goal Categories
15. Life on Land (OpenAlex)
Source: Crossref
Added: August 28, 2020

2010 journal article

A high-throughput de novo sequencing approach for shotgun proteomics using high-resolution tandem mass spectrometry

BMC Bioinformatics, 11.

By: C. Pan, B. Park, W. McDonald, P. Carey, J. Banfield, N. VerBerkmoes, R. Hettich, N. Samatova

Source: NC State University Libraries
Added: August 6, 2018

2010 journal article

Coordinating Computation and I/O in Massively Parallel Sequence Search

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 22(4), 529–543.

By: H. Lin*, X. Ma n, W. Feng* & N. Samatova n

author keywords: Scheduling; parallel I/O; bioinformatics; parallel genomic sequence search; BLAST
TL;DR: This study reveals that the lack of coordination between computation scheduling and I/O optimization could result in severe performance issues, and proposes an integrated scheduling approach that effectively improves sequence-search throughput by gracefully coordinating the dynamic load balancing of computation and high-performance noncontiguous I/O. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2010 conference paper

Detecting and Tracking Community Dynamics in Evolutionary Networks

The 10th IEEE International Conference on Data Mining Workshops, 318–327.

By: Z. Chen n, K. Wilson n, Y. Jin n & N. Samatova n

TL;DR: This paper proposes an efficient method for detecting and tracking community dynamics in evolutionary networks by introducing graph representatives and community representatives to avoid generating redundant communities and limit the search space. (via Semantic Scholar)
UN Sustainable Development Goal Categories
2. Zero Hunger (OpenAlex)
Source: NC State University Libraries
Added: August 6, 2018

2010 journal article

Theoretical underpinnings for maximal clique enumeration on perturbed graphs

THEORETICAL COMPUTER SCIENCE, 411(26-28), 2520–2536.

By: W. Hendrix n, M. Schmidt n, P. Breimyer n & N. Samatova n

author keywords: Graph perturbation theory; Maximal clique enumeration; Graph algorithms; Uncertain and noisy data
TL;DR: By enumerating only the difference set between the baseline and perturbed graphs' MCEs, the computational cost of enumerating the maximal cliques of the perturbed graph can be reduced. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2010 journal article

Transparent runtime parallelization of the R scripting language

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 71(2), 157–168.

By: J. Li*, X. Ma n, S. Yoginath*, G. Kora* & N. Samatova n

author keywords: Runtime parallelization; Incremental analysis; Scripting languages
Source: Web Of Science
Added: August 6, 2018

2009 journal article

A scalable, parallel algorithm for maximal clique enumeration

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 69(4), 417–428.

By: M. Schmidt n, N. Samatova n, K. Thomas & B. Park*

author keywords: Maximal clique enumeration; Parallel graph algorithms; High-performance computing; Dynamic load balancing; Biological networks; Cray XT
TL;DR: This paper proposes a parallel, scalable, and memory-efficient MCE algorithm for distributed and/or shared memory high performance computing architectures, whose runtime scales linearly for thousands of processors on real-world application graphs with hundreds and thousands of nodes. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2009 conference paper

BioDEAL: community generation of biological annotations

BMC Medical Informatics and Decision Making, 9.

By: P. Breimyer, N. Green, V. Kumar & N. Samatova

Source: NC State University Libraries
Added: August 6, 2018

2009 article

Compressed ion temperature gradient turbulence in diverted tokamak edge

Chang, C. S., Ku, S., Diamond, P. H., Lin, Z., Parker, S., Hahm, T. S., & Samatova, N. (2009, May). PHYSICS OF PLASMAS, Vol. 16.

By: C. Chang n, S. Ku n, P. Diamond n, Z. Lin n, S. Parker n, T. Hahm n, N. Samatova n

author keywords: plasma boundary layers; plasma density; plasma instability; plasma simulation; plasma toroidal confinement; plasma transport processes; plasma turbulence; Tokamak devices
Source: Web Of Science
Added: August 6, 2018

2009 journal article

Cultivation and quantitative proteomic analyses of acidophilic microbial communities

The ISME Journal, 4(4), 520–530.

By: C. Belnap*, C. Pan*, N. VerBerkmoes*, M. Power*, N. Samatova*, R. Carver*, R. Hettich*, J. Banfield*

author keywords: proteomics; acid mine drainage; biofilm
MeSH headings : Acids / metabolism; Archaea / growth & development; Archaea / metabolism; Bacteria / growth & development; Bacteria / metabolism; Biofilms / growth & development; Environmental Microbiology; Iron / metabolism; Oxidation-Reduction; Proteome / analysis
TL;DR: The research presented here represents the first description of the application of a metabolic labeling-based quantitative proteomic analysis at the community level and resulted in a model microbial community system ideal for testing physiological and ecological hypotheses. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2009 conference paper

Fast matching for all pairs similarity search

2009 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT), vol 1, 295–300.

By: A. Awekar n & N. Samatova*

TL;DR: This work proposes a fast matching technique that uses the sparse nature of real-world data to effectively reduce the size of the search space through a systematic set of tighter filtering conditions and heuristic optimizations. (via Semantic Scholar)
Source: NC State University Libraries
Added: August 6, 2018

2009 journal article

Improved genome annotation for Zymomonas mobilis

Nature Biotechnology, 27(10), 893–894.

By: S. Yang*, K. Pappas*, L. Hauser*, M. Land*, G. Chen*, G. Hurst*, C. Pan*, V. Kouvelis* ...

MeSH headings : Databases, Genetic; Genome, Bacterial; Models, Genetic; Sequence Analysis, DNA; Terminology as Topic; Zymomonas / genetics
TL;DR: An overview of the extensive changes made to the ZM4 chromosome based upon mass-spectrometry proteomics and pyrosequencing data and six illustrative examples are presented. (via Semantic Scholar)
UN Sustainable Development Goal Categories
2. Zero Hunger (OpenAlex)
Source: Crossref
Added: August 28, 2020

2009 journal article

On parameterized complexity of the Multi-MCS problem

THEORETICAL COMPUTER SCIENCE, 410(21-23), 2024–2032.

By: W. Chen n, M. Schmidt n & N. Samatova n

author keywords: Algorithms; Maximum common subgraph; Parameterized complexity; Linear FPT reduction
TL;DR: The maximum common subgraph problem for multiple graphs (Multi-MCS) inspired by various biological applications such as multiple alignments of gene sequences, protein structures, metabolic pathways, or protein-protein interaction networks is introduced. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2009 journal article

On the parameterized complexity of the Multi-MCT and Multi-MCST problems

JOURNAL OF COMBINATORIAL OPTIMIZATION, 21(2), 151–158.

By: W. Chen n, M. Schmidt n & N. Samatova n

author keywords: Multi-MCT; Multi-MCST; W-hierarchy; Parameterized complexity; Computational complexity
TL;DR: This paper proves parameterized complexity hardness results for the different parameterized versions of the Multi-MCT and Multi-MCST problem under isomorphic embeddings. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2008 chapter

Adaptive Request Scheduling for Parallel Scientific Web Services

In Lecture Notes in Computer Science (pp. 276–294).

By: H. Lin n, X. Ma n, J. Li n, T. Yu n & N. Samatova*

TL;DR: This paper systematically investigates adaptive scheduling for scientific web services, by taking into account parallel computation scalability, data locality, and load balancing, and presents several dynamic scheduling techniques that automatically adapt to the request workload and system configuration in making scheduling decisions. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2008 journal article

From pull-down data to protein interaction networks and complexes with biological relevance

BIOINFORMATICS, 24(7), 979–986.

MeSH headings : Algorithms; Biology / methods; Computer Simulation; Databases, Protein; Gene Expression Profiling / methods; Information Storage and Retrieval / methods; Models, Biological; Peptide Mapping / methods; Protein Interaction Mapping / methods; Proteins / chemistry; Proteins / metabolism; Signal Transduction / physiology; Structure-Activity Relationship; Systems Integration
TL;DR: A multi-step but easy-to-follow framework for identifying protein complexes from MS pull-down data assesses interaction affinity between two proteins based on similarity of their co-purification patterns derived from MS data and constructs a protein interaction network by adopting a knowledge-guided threshold selection method. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2008 chapter

The Buffered Work-Pool Approach for Search-Tree Based Optimization Algorithms

In Parallel Processing and Applied Mathematics (pp. 170–179).

By: F. Abu-Khzam*, M. Rizk*, D. Abdallah* & N. Samatova*

TL;DR: A load balancing strategy is presented that could exploit multi-core architectures, such as clusters of symmetric multiprocessors, and the well-known Maximum Clique problem is used as an exemplar to illustrate the utility of this approach. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2007 journal article

Characterization of anaerobic catabolism of p-coumarate in Rhodopseudomonas palustris by integrating transcriptomics and quantitative proteomics

MOLECULAR & CELLULAR PROTEOMICS, 7(5), 938–948.

MeSH headings : Anaerobiosis / genetics; Bacterial Proteins / analysis; Bacterial Proteins / genetics; Bacterial Proteins / metabolism; Benzoates / metabolism; Coumaric Acids / metabolism; Gene Expression Profiling; Protein Biosynthesis / genetics; Proteomics; RNA, Messenger / analysis; RNA, Messenger / metabolism; Rhodopseudomonas / genetics; Rhodopseudomonas / growth & development; Rhodopseudomonas / metabolism; Succinic Acid / metabolism
TL;DR: The integrated gene expression data provided strong support for the non-β-oxidation route in R. palustris, consistent with the hypothesis that p-coumarate is converted to benzoyl-CoA, which is then degraded via a known aromatic ring reduction pathway. (via Semantic Scholar)
Source: Web Of Science
Added: August 6, 2018

2007 journal article

Sampling streaming data with replacement

Computational Statistics & Data Analysis, 52(2), 750–762.

By: B. Park*, G. Ostrouchov* & N. Samatova*

author keywords: data stream mining; random sampling with replacement; reservoir sampling
TL;DR: A with-replacement reservoir sampling algorithm of sub-linear time complexity is introduced and a thorough complexity analysis of several approaches to the with-replacement reservoir sampling problem is provided. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2006 journal article

Detecting Differential and Correlated Protein Expression in Label-Free Shotgun Proteomics

Journal of Proteome Research, 5(11), 2909–2918.

By: B. Zhang*, N. VerBerkmoes*, M. Langston*, E. Uberbacher*, R. Hettich* & N. Samatova*

author keywords: label-free; LC-MS/MS; shotgun proteomics; differential expression; correlated expression; clustering; Saccharomyces cerevisiae; Rhodopseudomonas palustris
MeSH headings : Bacterial Proteins / genetics; Chromatography, Liquid; Gene Expression; Mass Spectrometry; Proteins / chemistry; Proteins / genetics; Proteins / isolation & purification; Proteomics / methods; Reproducibility of Results; Rhodopseudomonas / genetics
TL;DR: A systematic analysis of various approaches to quantifying differential protein expression in eukaryotic Saccharomyces cerevisiae and prokaryotic Rhodopseudomonas palustris label-free LC-MS/MS data demonstrated that proteins co-located in the same operon were much more strongly coexpressed than those from different operons. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2006 journal article

Gene network shaping of inherent noise spectra

Nature, 439(7076), 608–611.

By: D. Austin, M. Allen, J. McCollum*, R. Dar, J. Wilgus*, G. Sayler*, N. Samatova*, C. Cox*, M. Simpson*

MeSH headings : Algorithms; Computer Simulation; Escherichia coli / cytology; Escherichia coli / genetics; Escherichia coli / growth & development; Escherichia coli Proteins / genetics; Gene Expression Regulation, Bacterial; Genes, Bacterial / genetics; Half-Life; Microscopy, Fluorescence; Models, Genetic; Regulatory Sequences, Nucleic Acid / genetics; Stochastic Processes
TL;DR: Noise spectral measurements provide mechanistic insights into gene regulation, as perturbations of gene circuit parameters are discernible in the measured noise frequency ranges, and suggest that noise spectral measurements could facilitate the discovery of novel regulatory relationships. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2006 journal article

ProRata: A Quantitative Proteomics Program for Accurate Protein Abundance Ratio Estimation with Confidence Interval Evaluation

Analytical Chemistry, 78(20), 7121–7131.

By: C. Pan*, G. Kora*, W. McDonald*, D. Tabb*, N. VerBerkmoes*, G. Hurst*, D. Pelletier*, N. Samatova*, R. Hettich*

MeSH headings : Algorithms; Bacterial Proteins / analysis; Bias; Confidence Intervals; Hot Temperature; Proteome; Proteomics / methods; Rhodopseudomonas; Software; Software Design
TL;DR: A profile likelihood algorithm is proposed for quantitative shotgun proteomics to infer the abundance ratio of proteins from the abundance ratios of isotopically labeled peptides derived from proteolysis and yields maximum likelihood point estimation and profile likelihood confidence interval estimation of protein abundance ratios. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2006 journal article

Robust Estimation of Peptide Abundance Ratios and Rigorous Scoring of Their Variability and Bias in Quantitative Shotgun Proteomics

Analytical Chemistry, 78(20), 7110–7120.

MeSH headings : Algorithms; Amino Acid Sequence; Bias; Chromatography; Hot Temperature; Ions; Molecular Sequence Data; Peptides / analysis; Peptides / chemistry; Proteomics / methods; Reproducibility of Results; Rhodopseudomonas; Tandem Mass Spectrometry
TL;DR: It is demonstrated that the profile signal-to-noise ratio is inversely correlated with the variability and bias of peptide abundance ratio estimation, and each abundance ratio is rigorously scored for the expected estimation bias and variability. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2005 chapter

A New Approach and Faster Exact Methods for the Maximum Common Subgraph Problem

In Lecture Notes in Computer Science (pp. 717–727).

By: W. Suters*, F. Abu-Khzam*, Y. Zhang*, C. Symons*, N. Samatova* & M. Langston*

TL;DR: In this paper a new algorithm, termed “clique branching,” is proposed that exploits a special structure inherent in the association graph that contains a large number of naturally-ordered cliques that are present in the association graph’s complement. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

2005 journal article

In silico Discovery of Enzyme–Substrate Specificity-determining Residue Clusters

Journal of Molecular Biology, 352(5), 1105–1117.

By: G. Yu*, B. Park*, P. Chandramohan*, R. Munavalli*, A. Geist* & N. Samatova*

author keywords: surface patch ranking; enzyme-substrate specificity-determining residues; sequence conservation; correlated mutations; mutagenesis
MeSH headings : Adenylyl Cyclases / physiology; Amino Acid Sequence; Amino Acids / chemistry; Amino Acids / physiology; Animals; Binding Sites / physiology; Cattle; Chymotrypsin / physiology; Computational Biology; Crystallography, X-Ray; Enzymes / chemistry; Enzymes / genetics; Enzymes / physiology; Guanylate Cyclase / physiology; L-Lactate Dehydrogenase / physiology; Malate Dehydrogenase / physiology; Molecular Sequence Data; Protein Structure, Tertiary; Substrate Specificity / physiology; Trypsin / chemistry; Trypsin / physiology
TL;DR: The results demonstrate that SPR may help the selection of target residues for mutagenesis experiments and, thus, focus rational drug design, protein engineering, and functional annotation to the relevant regions of a protein. (via Semantic Scholar)
UN Sustainable Development Goal Categories
15. Life on Land (OpenAlex)
Source: Crossref
Added: August 28, 2020

2005 journal article

The sorting direct method for stochastic simulation of biochemical systems with varying reaction execution behavior

Computational Biology and Chemistry, 30(1), 39–49.

By: J. McCollum*, G. Peterson*, C. Cox*, M. Simpson* & N. Samatova*

author keywords: stochastic simulation; modeling biochemical systems; Gillespie algorithm; gene networks; systems biology
MeSH headings : Algorithms; Aliivibrio fischeri / chemistry; Escherichia coli / chemistry; Models, Chemical; Stochastic Processes; Systems Biology / methods
TL;DR: This work examines the performance of different versions of Gillespie's stochastic simulation algorithm when applied to several biochemical models and proposes a new algorithm called the sorting direct method that maintains a loosely sorted order of the reactions as the simulation executes. (via Semantic Scholar)
Source: Crossref
Added: August 28, 2020

conference paper

ALACRITY: Analytics-driven lossless data compression for rapid in-situ indexing, storing, and querying

Jenkins, J., Arkatkar, I., Lakshminarasimhan, S., Boyuka, D. A., Schendel, E. R., Shah, N., … Samatova, N. F. Transactions on large-scale data- and knowledge-centered systems x: special issue on database- and expert-systems applications, 8220, 95–114.

By: J. Jenkins, I. Arkatkar, S. Lakshminarasimhan, D. Boyuka, E. Schendel, N. Shah, S. Ethier, C. Chang ...

Source: NC State University Libraries
Added: August 6, 2018

report

Anomaly detection in dynamic networks: A survey

Ranshous, S., Shen, S., Koutra, D., Faloutsos, C., & Samatova, N. F. In Technical Report- Not held in TRLN member libraries.

By: S. Ranshous, S. Shen, D. Koutra, C. Faloutsos & N. Samatova

Source: NC State University Libraries
Added: August 6, 2018

journal article

Characterizing gene and protein crosstalks in subjects at risk of developing Alzheimer's disease: A new computational approach

Padmanabhan, K., Nudelman, K., Harenberg, S., Bello, G., Sohn, D., Shpanskaya, K., … Samatova, N. F. Processes, 5(3).

By: K. Padmanabhan, K. Nudelman, S. Harenberg, G. Bello, D. Sohn, K. Shpanskaya, P. Dikshit, P. Yerramsetty ...

Source: NC State University Libraries
Added: August 6, 2018

report

Community detection in large-scale networks: A Survey and empirical evaluation

Harenberg, S., Bello, G. A., Gjeltema, L., Ranshous, S., Harlalka, J., Seay, R., … Samatova, N. In Technical Report- Not held in TRLN member libraries (p. 2014).

By: S. Harenberg, G. Bello, L. Gjeltema, S. Ranshous, J. Harlalka, R. Seay, K. Padmanabhan, N. Samatova

Source: NC State University Libraries
Added: August 6, 2018

report

Compressing the Incompressible with ISABELA: In-situ Reduction of Spatio-Temporal Data

Lakshminarasimhan, S., Shah, N., Ethier, S. J., Klasky, S., Latham, R., Ross, R., & Samatova, N. F. In Technical Report- Not held in TRLN member libraries.

By: S. Lakshminarasimhan, N. Shah, S. Ethier, S. Klasky, R. Latham, R. Ross, N. Samatova

Source: NC State University Libraries
Added: August 6, 2018

conference paper

Coupling graph perturbation theory with scalable parallel algorithms for large-scale enumeration of maximal cliques in biological graphs - art. no. 012053

Samatova, N. F., Schmidt, M. C., Hendrix, W., Breimyer, P., Thomas, K., & Park, B. H. Scidac 2008: Scientific discovery through advanced computing, 125, 12053–12053.

By: N. Samatova, M. Schmidt, W. Hendrix, P. Breimyer, K. Thomas & B. Park

Source: NC State University Libraries
Added: August 6, 2018

report

Forecaster: Forecast Oriented Feature Elimination-based Classification of Adverse Spatio-Temporal Extremes

Chen, Z., Pansombut, T., Hendrix, W., Gonzalez, D., Semazzi, F., Choudhary, A., … Samatova, N. F.

By: Z. Chen, T. Pansombut, W. Hendrix, D. Gonzalez, F. Semazzi, A. Choudhary, V. Kumar, A. Melechko, N. Samatova

Source: NC State University Libraries
Added: August 6, 2018

report

Parallel data layout optimization of scientific data through access-driven replication

Jenkins, J. P., Zou, X., Tang, H., Kimpe, D., Ross, R., & Samatova, N. F. In Technical Report- Not held in TRLN member libraries.

By: J. Jenkins, X. Zou, H. Tang, D. Kimpe, R. Ross & N. Samatova

Source: NC State University Libraries
Added: August 6, 2018
