References of "Samhi, Jordan 50035256"
Peer Reviewed
Sensitive and Personal Data: What Exactly Are You Talking About?
Kober, Maria; Samhi, Jordan UL; Arzt, Steven et al

in 10th International Conference on Mobile Software Engineering and Systems 2023 (2023, May)

Mobile devices are pervasively used for a variety of tasks, including the processing of sensitive data in mobile apps. While in most cases access to this data is legitimate, malware often targets sensitive data and even benign apps collect more data than necessary for their task. Therefore, researchers have proposed several frameworks to detect and track the use of sensitive data in apps, so as to disclose and prevent unauthorized access and data leakage. Unfortunately, a review of the literature reveals a lack of consensus on what sensitive data is in the context of technical frameworks like Android. Authors either provide an intuitive definition or an ad-hoc definition, derive their definition from the Android permission model, or rely on previous research papers which do or do not give a definition of sensitive data. In this paper, we provide an overview of existing definitions of sensitive data in literature and legal frameworks. We further provide a sound definition of sensitive data derived from the definition of personal data of several legal frameworks. To help the scientific community further advance in this field, we publicly provide a list of sensitive sources from the Android framework, thus starting a community project leading to a complete list of sensitive API methods across different frameworks and programming languages.

Peer Reviewed
Les nouvelles tendances en matière de sécurité pour les serveurs Linux [New Security Trends for Linux Servers]
Samhi, Jordan UL

Article for general public (2023)

Peer Reviewed
Les plateformes de bug bounty [Bug Bounty Platforms]
Samhi, Jordan UL

Article for general public (2023)

Peer Reviewed
Les web shells : tout ce que vous devez savoir [Web Shells: Everything You Need to Know]
Samhi, Jordan UL

Article for general public (2023)

Peer Reviewed
Negative Results of Fusing Code and Documentation for Learning to Accurately Identify Sensitive Source and Sink Methods: An Application to the Android Framework for Data Leak Detection
Samhi, Jordan UL; Kober, Maria; Kabore, Abdoul Kader UL et al

in 30th IEEE International Conference on Software Analysis, Evolution and Reengineering (2023, March)

Apps on mobile phones manipulate all sorts of data, including sensitive data, leading to privacy-related concerns. Recent regulations like the European GDPR provide rules for the processing of personal and sensitive data, for example that no such data may be leaked without the consent of the user. Researchers have proposed sophisticated approaches to track sensitive data within mobile apps, all of which rely on specific lists of sensitive source and sink methods. The data flow analysis results greatly depend on these lists' quality. Previous approaches either used incomplete hand-written lists that quickly became outdated or relied on machine learning. The latter, however, leads to numerous false positives, as we show. This paper introduces CoDoC, which aims to revive the machine-learning approach to precisely identify privacy-related source and sink API methods. In contrast to previous approaches, CoDoC uses deep learning techniques and combines the source code with the documentation of API methods. Firstly, we propose novel definitions that clarify the concepts of taint analysis, source, and sink methods. Secondly, based on these definitions, we build a new ground truth of Android methods representing sensitive source, sink, and neither methods that will be used to train our classifier. We evaluate CoDoC and show that, on our validation dataset, it achieves a precision, recall, and F1 score of 91%, outperforming the state-of-the-art SuSi. However, similarly to existing tools, we show that in the wild, i.e., with unseen data, CoDoC performs poorly and generates many false-positive results. Our findings suggest that machine-learning models for abstract concepts such as privacy fail in practice despite good lab results. To encourage future research, we release all our artifacts to the community.

Peer Reviewed
USB Drop Attacks
Samhi, Jordan UL

Article for general public (2023)

Peer Reviewed
Cryptojacking
Samhi, Jordan UL

Article for general public (2023)

Analyzing the Unanalyzable: an Application to Android Apps
Samhi, Jordan UL

Doctoral thesis (2023)

In general, software is unreliable. Its behavior can deviate from users’ expectations because of bugs, vulnerabilities, or even malicious code. Manually vetting software is a challenging, tedious, and costly task that does not scale. To alleviate excessive costs and analysts’ burdens, automated static analysis techniques have been proposed by both the research and practitioner communities, making static analysis a central topic in software engineering. In the meantime, mobile apps have considerably grown in importance. Today, most humans carry software in their pockets, with the Android operating system leading the market. Millions of apps have been proposed to the public so far, targeting a wide range of activities such as games, health, banking, GPS, etc. Hence, Android apps collect and manipulate a considerable amount of sensitive information, which puts users’ security and privacy at risk. Consequently, it is paramount to ensure that apps distributed through public channels (e.g., Google Play) are free from malicious code. Hence, the research and practitioner communities have put much effort into devising new automated techniques to vet Android apps against malicious activities over the last decade. Analyzing Android apps is, however, challenging. On the one hand, the Android framework proposes constructs that can be used to evade dynamic analysis by triggering the malicious code only under certain circumstances, e.g., if the device is not an emulator and is currently connected to power. Hence, dynamic analyses can easily be fooled by malicious developers who make some code fragments difficult to reach. On the other hand, static analyses are challenged by Android-specific constructs that limit the coverage of off-the-shelf static analyzers. The research community has already addressed some of these constructs, including inter-component communication or lifecycle methods. However, other constructs, such as implicit calls (i.e., when the Android framework asynchronously triggers a method in the app code), make some app code fragments unreachable to the static analyzers, while these fragments are executed when the app is run. Altogether, many apps’ code parts are unanalyzable: they are either not reachable by dynamic analyses or not covered by static analyzers. In this manuscript, we describe our contributions to the research effort from two angles: ① statically detecting malicious code that is difficult for dynamic analyzers to access because it is triggered only under specific circumstances; and ② statically analyzing code not accessible to existing static analyzers to improve the comprehensiveness of app analyses. More precisely, in Part I, we first present a replication study of a state-of-the-art static logic bomb detector to better show its limitations. We then introduce a novel hybrid approach for detecting suspicious hidden sensitive operations towards triaging logic bombs. We finally detail the construction of a dataset of Android apps automatically infected with logic bombs. In Part II, we present our work to improve the comprehensiveness of Android apps’ static analysis. More specifically, we first show how we contributed to account for atypical inter-component communication in Android apps. Then, we present a novel approach to unify both the bytecode and native code in Android apps to account for the multi-language trend in app development. Finally, we present our work to resolve conditional implicit calls in Android apps to improve static and dynamic analyzers.

Peer Reviewed
Demystifying Hidden Sensitive Operations in Android apps
Sun, Xiaoyu; Chen, Xiao; Li, Li et al

in ACM Transactions on Software Engineering and Methodology (2022)

Peer Reviewed
Analyse Statique et Automatisée de Code [Static and Automated Code Analysis]
Samhi, Jordan UL

Article for general public (2022)

Peer Reviewed
TriggerZoo: A Dataset of Android Applications Automatically Infected with Logic Bombs
Samhi, Jordan UL; Bissyande, Tegawendé François D Assise UL; Klein, Jacques UL

in 19th International Conference on Mining Software Repositories, Data Showcase (MSR 2022) (2022, May 23)

Many Android app analyzers rely, among other techniques, on dynamic analysis to monitor apps' runtime behavior and detect potential security threats. However, malicious developers use subtle, though efficient, techniques to bypass dynamic analyzers. Logic bombs are examples of popular techniques where the malicious code is triggered only under specific circumstances, challenging comprehensive dynamic analyses. The research community has proposed various approaches and tools to detect logic bombs. Unfortunately, rigorous assessment and fair comparison of state-of-the-art techniques are impossible due to the lack of ground truth. In this paper, we present TriggerZoo, a new dataset of 406 Android apps containing logic bombs and benign trigger-based behavior that we release only to the research community through an authenticated API. These apps are real-world apps from Google Play that have been automatically infected by our tool AndroBomb. The injected pieces of code implementing the logic bombs cover a wide palette of realistic logic bomb types that we have manually characterized from a set of real logic bombs. Researchers can exploit this dataset as ground truth to assess their approaches and provide comparisons against other tools.

Peer Reviewed
Difuzer: Uncovering Suspicious Hidden Sensitive Operations in Android Apps
Samhi, Jordan UL; Li, Li; Bissyande, Tegawendé François D Assise UL et al

in 44th International Conference on Software Engineering (ICSE 2022) (2022, May 21)

One prominent tactic used to keep malicious behavior from being detected during dynamic test campaigns is logic bombs, where malicious operations are triggered only when specific conditions are satisfied. Defusing logic bombs remains an unsolved problem in the literature. In this work, we propose to investigate Suspicious Hidden Sensitive Operations (SHSOs) as a step towards triaging logic bombs. To that end, we develop a novel hybrid approach that combines static analysis and anomaly detection techniques to uncover SHSOs, which we predict as likely implementations of logic bombs. Concretely, Difuzer identifies SHSO entry-points using an instrumentation engine and an inter-procedural data-flow analysis. Then, it extracts trigger-specific features to characterize SHSOs and leverages One-Class SVM to implement an unsupervised learning model for detecting abnormal triggers. We evaluate our prototype and show that it yields a precision of 99.02% in detecting SHSOs, among which 29.7% are logic bombs. Difuzer outperforms the state-of-the-art in revealing more logic bombs while yielding fewer false positives in about one order of magnitude less time. All our artifacts are released to the community.

Peer Reviewed
JuCify: A Step Towards Android Code Unification for Enhanced Static Analysis
Samhi, Jordan UL; Gao, Jun UL; Daoudi, Nadia UL et al

in 44th International Conference on Software Engineering (ICSE 2022) (2022, May 21)

Native code is now commonplace within Android app packages where it co-exists and interacts with Dex bytecode through the Java Native Interface to deliver rich app functionalities. Yet, state-of-the-art static analysis approaches have mostly overlooked the presence of such native code, which, however, may implement some key sensitive, or even malicious, parts of the app behavior. This limitation of the state of the art is a severe threat to validity in a large range of static analyses that do not have a complete view of the executable code in apps. To address this issue, we propose a new advance in the ambitious research direction of building a unified model of all code in Android apps. The JuCify approach presented in this paper is a significant step towards such a model, where we extract and merge call graphs of native code and bytecode to make the final model readily usable by a common Android analysis framework: in our implementation, JuCify builds on the Soot internal intermediate representation. We performed empirical investigations to highlight how, without the unified model, a significant number of Java methods called from the native code are "unreachable" in apps' call-graphs, both in goodware and malware. Using JuCify, we were able to enable static analyzers to reveal cases where malware relied on native code to hide invocation of payment library code or of other sensitive code in the Android framework. Additionally, JuCify's model enables state-of-the-art tools to achieve better precision and recall in detecting data leaks through native code. Finally, we show that by using JuCify we can find sensitive data leaks that pass through native code.

Peer Reviewed
On The (In)Effectiveness of Static Logic Bomb Detector for Android Apps
Samhi, Jordan UL; Bartel, Alexandre UL

in IEEE Transactions on Dependable and Secure Computing (2021)

Android is present in more than 85% of mobile devices, making it a prime target for malware. Malicious code is becoming increasingly sophisticated and relies on logic bombs to hide itself from dynamic analysis. In this paper, we perform a large-scale study of TSOpen, our open-source implementation of the state-of-the-art static logic bomb scanner TriggerScope, on more than 500k Android applications. Results indicate that the approach scales. Moreover, we investigate the discrepancies and show that the approach can reach a very low false-positive rate, 0.3%, but at a particular cost, e.g., removing 90% of sensitive methods. Therefore, it might not be realistic to rely on such an approach to automatically detect all logic bombs in large datasets. However, it could be used to speed up the location of malicious code, for instance, while reverse engineering applications. We also present TrigDB, a database of 68 Android applications containing trigger-based behavior, as a ground truth for the research community.

Peer Reviewed
Les dangers de pastebin [The Dangers of Pastebin]
Samhi, Jordan UL; Bissyande, Tegawendé François D Assise UL; Klein, Jacques UL

Article for general public (2021)

Peer Reviewed
RAICC: Revealing Atypical Inter-Component Communication in Android Apps
Samhi, Jordan UL; Bartel, Alexandre UL; Bissyande, Tegawendé François D Assise UL et al

in 43rd International Conference on Software Engineering (ICSE) (2021, May)

Inter-Component Communication (ICC) is a key mechanism in Android. It enables developers to compose rich functionalities and explore reuse within and across apps. Unfortunately, as reported by a large body of literature, ICC is rather "complex and largely unconstrained", leaving room for a lack of precision in app modeling. To address the challenge of tracking ICCs within apps, state-of-the-art static approaches such as Epicc, IccTA and Amandroid have focused on the documented framework ICC methods (e.g., startActivity) to build their approaches. In this work we show that ICC models inferred in these state-of-the-art tools may actually be incomplete: the framework provides other atypical ways of performing ICCs. To address this limitation in the state of the art, we propose RAICC, a static approach for modeling new ICC links and thus boosting previous analysis tasks such as ICC vulnerability detection, privacy leaks detection, malware detection, etc. We have evaluated RAICC on 20 benchmark apps, demonstrating that it improves the precision and recall of uncovered leaks in state-of-the-art tools. We have also performed a large empirical investigation showing that atypical ICC methods are largely used in Android apps, although not necessarily for data transfer. We also show that RAICC increases the number of ICC links found by 61.6% on a dataset of real-world malicious apps, and that RAICC enables the detection of new ICC vulnerabilities.

Peer Reviewed
A First Look at Android Applications in Google Play related to Covid-19
Samhi, Jordan UL; Allix, Kevin UL; Bissyande, Tegawendé François D Assise UL et al

in Empirical Software Engineering (2021)

Due to the convenience of access-on-demand to information and business solutions, mobile apps have become an important asset in the digital world. In the context of the Covid-19 pandemic, app developers have joined the response effort in various ways by releasing apps that target different user bases (e.g., all citizens or journalists), offer different services (e.g., location tracking or diagnostic-aid), provide generic or specialized information, etc. While many apps have raised some concerns by spreading misinformation or even malware, the literature does not yet provide a clear landscape of the different apps that were developed. In this study, we focus on the Android ecosystem and investigate Covid-related Android apps. In a best-effort scenario, we attempt to systematically identify all relevant apps and study their characteristics with the objective of providing a first taxonomy of Covid-related apps, broadening the relevance beyond the implementation of contact tracing. Overall, our study yields a number of empirical insights that contribute to enlarging the knowledge on Covid-related apps: (1) Developer communities contributed rapidly to the Covid-19 response effort, with dedicated apps released as early as January 2020; (2) Covid-related apps deliver digital tools to users (e.g., health diaries), serve to broadcast information to users (e.g., spread statistics), and collect data from users (e.g., for tracing); (3) Covid-related apps are less complex than standard apps; (4) they generally do not seem to leak sensitive data; (5) in the majority of cases, Covid-related apps are released by entities with past experience on the market, mostly official government entities or public health organizations.

Peer Reviewed
DexRay: A Simple, yet Effective Deep Learning Approach to Android Malware Detection Based on Image Representation of Bytecode
Daoudi, Nadia UL; Samhi, Jordan UL; Kabore, Abdoul Kader UL et al

in Communications in Computer and Information Science (2021)

Computer vision has witnessed several advances in recent years, with unprecedented performance provided by deep representation learning research. Image formats thus appear attractive to other fields such as malware detection, where deep learning on images alleviates the need for comprehensively hand-crafted features generalising to different malware variants. We postulate that this research direction could become the next frontier in Android malware detection, and therefore requires a clear roadmap to ensure that new approaches indeed bring novel contributions. We contribute with a first building block by developing and assessing a baseline pipeline for image-based malware detection with straightforward steps. We propose DexRay, which converts the bytecode of the app DEX files into grey-scale “vector” images and feeds them to a 1-dimensional Convolutional Neural Network model. We view DexRay as foundational due to the exceedingly basic nature of the design choices, allowing us to infer what could be a minimal performance that can be obtained with image-based learning in malware detection. The performance of DexRay evaluated on over 158k apps demonstrates that, while simple, our approach is effective with a high detection rate (F1-score = 0.96). Finally, we investigate the impact of time decay and image-resizing on the performance of DexRay and assess its resilience to obfuscation. This work-in-progress paper contributes to the domain of Deep Learning based Malware detection by providing a sound, simple, yet effective approach (with available artefacts) that can be the basis to scope the many profound questions that will need to be investigated to fully develop this domain.
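
As a rough illustration of the bytecode-to-image step described in this abstract, the sketch below maps each byte to one grey-scale pixel (0 to 255) and resamples the result to a fixed length so it could serve as input to a 1-dimensional model. This is not the authors' implementation: the function name `bytes_to_vector_image`, the nearest-neighbour resampling, and the `target_len` default are assumptions made for illustration, and the CNN itself is omitted.

```python
from typing import List


def bytes_to_vector_image(data: bytes, target_len: int = 16) -> List[int]:
    """Turn raw bytes into a fixed-length grey-scale "vector" image.

    Hypothetical helper: each byte is read as one pixel intensity (0-255),
    then the sequence is resampled to target_len pixels with
    nearest-neighbour index selection (an assumed resizing strategy).
    """
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    if not data:
        # No bytes to read: return an all-black vector of the requested size.
        return [0] * target_len
    n = len(data)
    # For output pixel i, pick the source byte at the proportional position.
    return [data[(i * n) // target_len] for i in range(target_len)]


if __name__ == "__main__":
    # Stand-in for the concatenated bytes of an app's DEX files.
    fake_dex = bytes(range(8))
    print(bytes_to_vector_image(fake_dex, target_len=4))
```

In a real pipeline the input would be the bytes of the app's DEX files and the output vector would be normalised before training; both steps are left out here to keep the sketch self-contained.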

Peer Reviewed
Désamorcer des bombes logiques [Defusing Logic Bombs]
Samhi, Jordan UL; Bartel, Alexandre UL

Article for general public (2020)
